40,061+ open-access research outputs.
LLM agents are expected to complete end-to-end units of work across software tools, business services, and local workspaces. Yet many agent benchmarks freeze a curated task set at release time and gra…
The bond-valence model is a standard way to estimate bond strengths in crystals, but its exponential dependence on bond length has lacked a derivation from a specific physical interaction. We show tha…
Rule-based systems remain central in safety-critical domains but often struggle with scalability, brittleness, and goal misspecification. These limitations can lead to reward hacking and failures in f…
Text-to-SQL (T2SQL) evaluation in production environments poses fundamental challenges that existing benchmarks do not address. Current evaluation methodologies, whether rule-based SQL matching or sche…
In a recent preprint (Mosegaard and Curtis, 2024, arXiv:2411.13570v2) we analyzed the consequences of ignoring the well-known inconsistency of classical conditional probability densities. We explained…
We study semileptonic sum rules for $b \to c \tau \overline{\nu}$ transitions involving orbitally excited charm hadrons. Starting from the amplitude-level relation implied by the heavy quark symmetry,…
Joint image compression and wireless transmission remains relatively underexplored compared to generic image restoration, despite its importance in practical communication systems. We formulate this pr…
Large Language Models (LLMs) have rapidly improved in performance across code-related tasks, making their integration into Register Transfer Level (RTL) development increasingly attractive. Mimicking …
Integrating domain knowledge into deep neural networks is a promising way to improve generalization. Existing methods either encode prior knowledge in the loss function or apply post-processing module…
This work studies numerical integration by the Möbius-transformed trapezoidal rule, which combines the classical trapezoidal rule with a change of variables induced by a Möbius transformation that…
Arabidopsis roots show oscillatory growth patterns on homogeneous agar surfaces, whereas other plants, such as maize, do not. Although several explanations have been proposed, a simple and general mod…
Integrated Circuit (IC) verification consumes nearly 70% of the IC development cycle, and recent research leverages Large Language Models (LLMs) to automatically generate testbenches and reduce verifi…
We present WaferSAGE, a framework for wafer defect visual question answering using small vision-language models. To address data scarcity in semiconductor manufacturing, we propose a three-stage synth…
We establish an Itô-type formula for finite $p$-variation paths with jumps for arbitrary $p\geq 1$. The formula is stated in a fully pathwise form and separates the reduced rough integral from expli…
We propose a novel mechanism to explain the naturally small vacuum expectation values (VEVs) of exotic multi-Higgs fields by employing non-invertible symmetries. Specifically, we introduce an $SU(2)_L…
Policy gradient methods are reinforcement learning algorithms that adapt a parameterized policy by following a performance gradient estimate. Conventional policy gradient methods use Monte-Carlo techn…
Local fine-tuning datasets routinely contain sensitive secrets such as API keys, personal identifiers, and financial records. Although "local offline fine-tuning" is often viewed as a privacy bounda…
This paper continues the work of Ren et al. (2026), aiming to further devise q-learning algorithms for mean-field control (MFC) with controlled common noise. Based on the relaxed control formulatio…
As LLMs become credible readers of earnings calls, investor-relations Q&A, guidance, and disclosure language, supervised financial NLP benchmarks increasingly function as decision evidence for model …
Compositional generalization tests are often used to estimate the compositionality of LLMs. However, such tests have the following limitations: (1) they only focus on the output results without consid…