5,754+ open-access research outputs.
Vision-Language-Action (VLA) models have increasingly incorporated reasoning mechanisms for complex robotic manipulation. However, existing approaches share a critical limitation: whether employing ex…
We consider smooth convex minimization over compact convex sets, i.e., $\min_{x \in C} f(x)$ with the (vanilla) Frank-Wolfe algorithm. Well-known lower bounds establish a worst-case $\Omega(1/t)$ prim…
Vision-Language-Action (VLA) models advance robotic control via strong visual-linguistic priors. However, existing VLAs predominantly frame pretraining as supervised behavior cloning, overlooking the …
Audio-based stuttering systems to date have been trained for detection -- what disfluency is present now -- leaving prediction, the capability needed for closed-loop intervention, unstudied at deploya…
3D Gaussian Splatting (3DGS) achieves high-quality novel view synthesis with real-time rendering, but its storage cost remains prohibitive for practical deployment. Existing post-training compression …
While large models demonstrate the strong representational power of vanilla attention, this core mechanism cannot be directly applied to Dense Object Tracking: its quadratic all-to-all interactions ar…
Video conferencing over 5G is increasingly prevalent, yet its Quality of Experience (QoE) often degrades under limited radio resources. This has two causes: 5G networks must serve many users, while in…
We introduce the Local Occupied Volatility (LOV) model that sits between Dupire's local volatility and fully path-dependent dynamics. By design, the LOV model ensures automatic calibration to European…
Masked diffusion models (MDMs) generate discrete sequences by iterative denoising under an absorbing masking process. In standard masked diffusion, if a token remains masked after a reverse update, th…
Commercial TTS systems produce near-native Indic audio, but the best open-source bases (Chatterbox, Indic Parler-TTS, IndicF5) trail them on measured phonological dimensions, and the most widely adopt…
A multilingual collection may contain useful knowledge in other languages to supplement and correct the facts in the original language for Retrieval-Augmented Generation (RAG). However, the vanilla ap…
Backward stochastic differential equation (BSDE) provides probabilistic solutions for a class of parabolic partial differential equations (PDEs). DeepBSDE and FBSNN are two deep learning approaches fo…
While the optimal sample complexity of binary classification in terms of the VC dimension is well-established, determining the optimal sample complexity of multiclass classification has remained open.…
As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations emerged as an architectural bottleneck. We identify and for…
Fast Adversarial Training (FAT) has attracted significant attention due to its efficiency in enhancing neural network robustness against adversarial attacks. However, FAT is prone to catastrophic over…
We present ZFLean, a Lean 4 library for doing core mathematics inside a model of ZFC with the ergonomics expected of typed Mathlib developments. Building on Mathlib's ZFC model, we contribute a relati…
Create an idea, prototype it, evaluate if users like it, then learn. It is the circle of business. If AI can operate in all parts of the circle, it will enable rapid iteration and learning speeds for …
On-policy distillation (OPD) has shown strong potential for transferring reasoning ability from frontier or domain-specific models to smaller students. While effective on static single-turn tasks, its…
In this paper, we introduce the Quasi-Quadratic Gradient (QQG), a novel search direction designed to accelerate the BFGS method within the quasi-Newton framework. By defining the QQG as the product of…
Hypernetwork-based methods such as Doc-to-LoRA internalize a document into an LLM's weights in a single forward pass, but they fail systematically on conflicts: when the document contradicts pretraini…
Free open-access publishing with Google Scholar indexing.
Submission Guide →