7,402+ open-access research outputs.
Many agents in real-world environments cannot reliably communicate their goals through language, including household pets, pre-verbal infants, and other non-speaking embodied agents. In such settings,…
Computer-use agents provide a promising path toward general software automation because they can interact directly with arbitrary graphical user interfaces instead of relying on brittle, application-s…
Multimodal large language models (MLLMs) have achieved impressive progress on general multimodal tasks, yet they remain brittle on dial-based measurement reading. In this paper, we study this problem …
Graph Contrastive Learning (GCL) has emerged as a prominent framework for unsupervised graph representation learning. However, relying on augmentation design alone to define the invariances learned by…
Video large language models (VideoLLMs) are increasingly trained or instruction-tuned on large-scale video-text corpora collected from heterogeneous sources, raising an immediate privacy question: ca…
While Vision-Language Models (VLMs) have achieved state-of-the-art performance in general visual tasks, their perceptual robustness remains remarkably brittle when confronted with optical illusions. T…
Treatment allocation under budget constraints is a central challenge in digital advertising: advertisers must decide which users to show ads to while spending a limited budget wisely. The standard app…
We give a criterion for certain generic nondegenerate surfaces in a fake weighted projective $3$-space to have Picard number $>1$. These algebraic surfaces are of general type. We do this by consideri…
We introduce a linear map on symmetric functions that 'divides' a partition by a positive integer $k$, sending a Schur function indexed by a partition of $kn$ to a symmetric function indexed by partit…
We study backward stochastic differential equations (BSDEs) in infinite horizon and design efficient numerical schemes for solving them. We establish a probabilistic representation of the solution of …
Detecting hate speech in memes is challenging due to their multimodal nature and subtle, culturally grounded cues such as sarcasm and context. While recent vision-language models (VLMs) enable joint r…
Constraint-based causal discovery is brittle in finite-sample regimes because erroneous conditional-independence (CI) decisions can cascade into substantial structural errors. We propose Quantitative …
Privacy-critical domains require phishing detection systems that satisfy contradictory constraints: near-zero false positives to prevent workflow disruption, transparent explanations for non-expert st…
We develop a unified T-extended framework for weakly contractive, weakly Kannan, and Geraghty classes of self-maps S on a metric space (X, d), where distances are measured on the auxiliary image via d…
Generative retrieval (GR) ranks documents by autoregressively generating document identifiers. Because many GR methods rely on trie-constrained beam search, they are vulnerable to early pruning of rel…
The authors previously formulated the hybrid conjecture, unifying the André-Pink-Zannier and André-Oort conjectures, and proved it in Shimura varieties of abelian type. We study its analogue for mixed…
Cross-domain task-oriented dialogue requires reasoning over implicit and explicit feasibility constraints while planning long-horizon, multi-turn actions. Large language models (LLMs) can infer such c…
Web-scale 3D asset collections are abundant, but rarely deployment-ready. Assets ship with arbitrary metric scale, incorrect pivots and forward axes, brittle geometry, and textures that do not support…
Blockchain wallets conventionally follow an ownership model where possession of a private key grants unilateral control. However, this assumption is brittle for emerging settings such as AI agent wall…
Vision-Language-Action (VLA) models promise generalist robot manipulation, but are typically trained and deployed as short-horizon policies that assume the latest observation is sufficient for action …