David Ruhe — Research Repository

Computer Science Preprint PDF DOI

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

Chenxin Li, Zhengyang Tang, Huangxin Lin, Yunlong Lin, Shijue Huang, Shengyuan Liu, Bowen Ye, Rang Li, Lei Li, Benyou Wang, Yixuan Yuan · 2026

LLM agents are expected to complete end-to-end units of work across software tools, business services, and local workspaces. Yet many agent benchmarks freeze a curated task set at release time and gra…

Physics Preprint PDF DOI

Yukawa screening derivation of the bond-valence rule

Michael L. Whittaker, Pan Wang, Chunhui Li, Naman Katyal, Piotr Zarzycki · 2026

The bond-valence model is a standard way to estimate bond strengths in crystals, but its exponential dependence on bond length has lacked a derivation from a specific physical interaction. We show tha…

Computer Science Preprint PDF DOI

Towards Neuro-symbolic Causal Rule Synthesis, Verification, and Evaluation Grounded in Legal and Safety Principles

Zainab Rehan, Christian Medeiros Adriano, Sona Ghahremani, Holger Giese · 2026

Rule-based systems remain central in safety-critical domains but often struggle with scalability, brittleness, and goal misspecification. These limitations can lead to reward hacking and failures in f…

AI & Data Science Preprint PDF DOI

Agent-Agnostic Evaluation of SQL Accuracy in Production Text-to-SQL Systems

Taslim Jamal Arif, Kuldeep Singh · 2026

Text-to-SQL (T2SQL) evaluation in production environments poses fundamental challenges that existing benchmarks do not address. Current evaluation methodologies, whether rule-based SQL matching or sche…

AI & Data Science Preprint PDF DOI

Response to: "A note on conditional densities, Bayes' rule, and recent criticisms of Bayesian inference" by Yan et al., 2026

Klaus Mosegaard, Andrew Curtis · 2026

In a recent preprint (Mosegaard and Curtis, 2024, arXiv:2411.13570v2) we analyzed the consequences of ignoring the well-known inconsistency of classical conditional probability densities. We explained…

Physics Preprint PDF DOI

$b \to c$ semileptonic sum rule: orbitally excited hadrons

Motoi Endo, Syuhei Iguro, Satoshi Mishima · 2026

We study semileptonic sum rules for $b \to c \tau \overline{\nu}$ transitions involving orbitally excited charm hadrons. Starting from the amplitude-level relation implied by the heavy quark symmetry,…

Engineering Preprint PDF DOI

Diffusion-OAMP for Joint Image Compression and Wireless Transmission

Wentao Hou, Yimin Bai, Zelei Luo, Jiadong Hong, Lei Liu · 2026

Joint image compression and wireless transmission remains relatively underexplored compared to generic image restoration, despite its importance in practical communication systems. We formulate this pr…

Computer Science Preprint PDF DOI

RuC: HDL-Agnostic Rule Completion Benchmark Generation

Arnau Ayguade Domingo, Miquel Alberti-Binimelis, Cristian Gutierrez-Gomez, Emanuele Parisi, Razine Moundir Ghorab, Miquel Moreto, Gokcen Kestor, Dario Garcia-Gasulla · 2026

Large Language Models (LLMs) have rapidly improved in performance across code-related tasks, making their integration into Register Transfer Level (RTL) development increasingly attractive. Mimicking …

AI & Data Science Preprint PDF DOI

Learning to Reason: Targeted Knowledge Discovery and Fuzzy Logic Update for Robust Image Recognition

Gurucharan Srinivas, Joshua Niemeijer, Frank Koster · 2026

Integrating domain knowledge into deep neural networks is a promising way to improve generalization. Existing methods either encode prior knowledge in the loss function or apply post-processing module…

Mathematics Preprint PDF DOI

Möbius-transformed trapezoidal rule for polynomial weights

Nuutti Hyvonen, Yuya Suzuki · 2026

This work studies numerical integration by the Möbius-transformed trapezoidal rule, which combines the classical trapezoidal rule with a change of variables induced by a Möbius transformation that…

Biology & Life Sciences Preprint PDF DOI

Delayed control driven oscillations in plant roots

Riz Fernando Noronha, Kunihiko Kaneko, Koichi Fujimoto · 2026

Arabidopsis roots show oscillatory growth patterns on homogeneous agar surfaces, whereas other plants, such as maize, do not. Although several explanations have been proposed, a simple and general mod…

Computer Science Preprint PDF DOI

HAVEN: Hybrid Automated Verification ENgine for UVM Testbench Synthesis with LLMs

Chang-Chih Meng, Yu-Ren Lu, Guan-Yu Lin, Tsung Tai Yeh, Kai-Chiang Wu, I-Chen Wu · 2026

Integrated Circuit (IC) verification consumes nearly 70% of the IC development cycle, and recent research leverages Large Language Models (LLMs) to automatically generate testbenches and reduce verifi…

AI & Data Science Preprint PDF DOI

WaferSAGE: Large Language Model-Powered Wafer Defect Analysis via Synthetic Data Generation and Rubric-Guided Reinforcement Learning

Ke Xu · 2026

We present WaferSAGE, a framework for wafer defect visual question answering using small vision-language models. To address data scarcity in semiconductor manufacturing, we propose a three-stage synth…

Mathematics Preprint PDF DOI

Jump Itô-type formula with arbitrary regularity

Nannan Li, Xing Gao · 2026

We establish an Itô-type formula for finite $p$-variation paths with jumps for arbitrary $p\geq 1$. The formula is stated in a fully pathwise form and separates the reduced rough integral from expli…

Physics Preprint PDF DOI

A theoretical account of tiny multi-Higgs vacuum expectation values from non-invertible symmetry

Takaaki Nomura, Hiroshi Okada · 2026

We propose a novel mechanism to explain the naturally small vacuum expectation values (VEVs) of exotic multi-Higgs fields by employing non-invertible symmetries. Specifically, we introduce an $SU(2)_L…

AI & Data Science Preprint PDF DOI

Bayesian policy gradient and actor-critic algorithms

Mohammad Ghavamzadeh, Yaakov Engel, Michal Valko · 2026

Policy gradient methods are reinforcement learning algorithms that adapt a parameterized policy by following a performance gradient estimate. Conventional policy gradient methods use Monte-Carlo techn…

Computer Science Preprint PDF DOI

Secret Stealing Attacks on Local LLM Fine-Tuning through Supply-Chain Model Code Backdoors

Zi Li, Tian Zhou, Wenze Li, Jingyu Hua, Yunlong Mao, Sheng Zhong · 2026

Local fine-tuning datasets routinely contain sensitive secrets such as API keys, personal identifiers, and financial records. Although "local offline fine-tuning" is often viewed as a privacy bounda…

Mathematics Preprint PDF DOI

Continuous-time q-learning for mean-field control with common noise, part-II: q-learning algorithms

Zhenjie Ren, Xiaoli Wei, Xiang Yu, Xun Yu Zhou · 2026

This paper continues the work of Ren et al. (2026), aiming to further devise q-learning algorithms for mean-field control (MFC) with controlled common noise. Based on the relaxed control formulatio…

AI & Data Science Preprint PDF DOI

Measurement Risk in Supervised Financial NLP: Rubric and Metric Sensitivity on JF-ICR

Sidi Chang, Peiying Zhu, Yuxiao Chen, Rongdong Chai · 2026

As LLMs become credible readers of earnings calls, investor-relations Q&A, guidance, and disclosure language, supervised financial NLP benchmarks increasingly function as decision evidence for model …

AI & Data Science Preprint PDF DOI

Investigating More Explainable and Partition-Free Compositionality Estimation for LLMs: A Rule-Generation Perspective

Ziyao Xu, Cong Wang, Houfeng Wang · 2026

Compositional generalization tests are often used to estimate the compositionality of LLMs. However, such tests have the following limitations: (1) they only focus on the output results without consid…
