5,995+ open-access research outputs.
Medical retrieval-augmented generation (RAG) systems typically operate on text chunks extracted from biomedical literature, discarding the rich visual content (tables, figures, structured layouts) of …
Large language models (LLMs) are commonly evaluated for political bias based on their responses to fixed questionnaires, which typically place frontier models on the political left. A parallel literat…
We adapted the Reliable Change Index (RCI; Jacobson and Truax, 1991) from clinical psychology to item-level LLM version comparison on 2,000 MMLU-Pro items (K=10 samples at T=0.7). Two within-family pa…
Buchbinder and Feldman recently gave a deterministic $(1-1/e-\varepsilon)$-approximation for maximizing a non-negative monotone submodular function subject to a matroid constraint, with query complexi…
When instructed to underperform on multiple-choice evaluations, do language models engage with question content or fall back on positional shortcuts? We map the boundary between these regimes using a …
We prove that the directed five-dimensional torus $D_5(m) = \operatorname{Cay}((\mathbb{Z}_m)^5, \{e_0, e_1, e_2, e_3, e_4\})$ has a Hamilton decomposition for every odd integer $m \geq 3$. This is th…
Hybrid quantum-classical workflows often execute large ensembles of circuits that differ syntactically but implement identical operations, leading to substantial redundant computation. To address thi…
A predecessor pilot (Cacioli, 2026) found that Llama-3-8B implements prompted sandbagging as positional collapse rather than answer avoidance. However, fixed option ordering in MMLU-Pro left open whet…
This is Paper 6 of a series of formally verified analyses of masked NTT hardware for post-quantum cryptography; Paper 1 [1] established structural dependency analysis of the QANARY platform, and Paper…
Reinforcement learning with verifiable rewards (RLVR) enhances the reasoning of large language models (LLMs), but standard RLVR often depends on human-annotated answers or carefully curated reward spe…
Detecting sandbagging, the deliberate underperformance on capability evaluations, is an open problem in AI safety. We tested whether symptom validity testing (SVT) logic from clinical malingering dete…
We study Chebyshev quotients that arise in the representation theory of Lie algebras, specifically within the theory of Demazure flags for fusion products of $\mathfrak{sl}_2[t]$-modules. Motivated by…
Coherent quantum rollout for sequential decision problems requires a unitary simulator: randomness must live in explicit quantum registers, and basis-state selectors must be mapped to actions reversib…
Recent work revisiting measurability in the fundamental theorem of statistical learning imposes Borel measurability of ghost-gap suprema. We show that, at the one-sided ghost-gap interface actually us…
Barrett reduction is the nonlinear core of every practical NTT-based post-quantum cryptography implementation. Existing composition frameworks (ISW, t-SNI, PINI, DOM) address Boolean masking over GF(2…
The rapid integration of Large Language Models (LLMs) into Multi-Agent Systems (MAS) has significantly enhanced their collaborative problem-solving capabilities, but it has also expanded their attack …
Retrieving or generating two-dimensional molecular structures on the basis of vibrational spectra has been well demonstrated via deep learning models. However, deciphering three-dimensional molecular …
Kinetic models based on the Bhatnagar-Gross-Krook (BGK) framework provide an efficient alternative to the Boltzmann equation for rarefied gas flows; however, existing formulations for gas mixtures rem…
We present ZFLean, a Lean 4 library for doing core mathematics inside a model of ZFC with the ergonomics expected of typed Mathlib developments. Building on Mathlib's ZFC model, we contribute a relati…
Current Vision-Language-Action (VLA) models predominantly rely on end-to-end fine-tuning. While effective, this paradigm compromises the inherent generalization capabilities of Vision-Language Models …