Falk Eilenberger — Research Repository

Computer Science · Preprint

Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes

Tianyuan Wu, Chaokun Chang, Lunxi Cao, Wei Gao, Wei Wang · 2026

Autonomous agents act through sandboxed containers and microVMs whose state spans filesystems, processes, and runtime artifacts. Checkpoint and restore (C/R) of this state is needed for fault toleranc…

AI & Data Science · Preprint

Focus Session: Autonomous Systems Dependability in the era of AI: Design Challenges in Safety, Security, Reliability and Certification

Behnaz Ranjbar, Kirankumar Raveendiran, Sudeep Pasricha, Samarjit Chakraborty, Cecilia Carbonelli, Akash Kumar · 2026

The design of embedded safety-critical systems, such as those used in next-generation automotive and autonomous platforms, is increasingly challenged by escalating system complexity, hardware-software …

Computer Science · Preprint

The Grand Software Supply Chain of AI Systems

Carmine Cesarano, Martin Monperrus · 2026

AI systems rest on software with low integrity mechanisms, leaving AI systems exposed across every stage from data acquisition to final inference. This paper makes the AI supply chain a first-class ob…

Computer Science · Preprint

Temporal Routing in Static Networks: The Schedule Completion Problem

Michelle Doring, Niklas Mohrin, George Skretas · 2026

We introduce the TemporallyEdgeDisjointScheduleCompletion (TEDSC) problem in which we need to cover a set of temporal edge demands $D$ by routing $k$ temporal walks through a directed static graph whi…

Mathematics · Preprint

Jump Itô-type formula with arbitrary regularity

Nannan Li, Xing Gao · 2026

We establish an Itô-type formula for finite $p$-variation paths with jumps for arbitrary $p\geq 1$. The formula is stated in a fully pathwise form and separates the reduced rough integral from expli…

AI & Data Science · Preprint

Decoding Scientific Experimental Images: The SPUR Benchmark for Perception, Understanding, and Reasoning

Junpeng Ding, Zichen Tang, Haihong E, Mengyuan Ji, Yang Liu, Haolin Tian, Haiyang Sun, Pengqi Sun, Yang Xu, Yichen Liu, Haocheng Gao, Zijie Xi, Ruomeng Jiang, Peizhi Zhao, Rongjin Li, Yuanze Li, Jiacheng Liu, Zhongjun Yang, Jintong Chen, Siying Lin · 2026

We introduce SPUR, a comprehensive benchmark for scientific experimental image perception, understanding, and reasoning, comprising 4,264 question-answering (QA) pairs derived from 1,084 expert-curate…

Biology & Life Sciences · Preprint

Epidemic Extinction in a Continuous SIRS Model with Vaccination

Germano Hartmann Brill, Pablo Enrique Jurado Silvestrin, Sebastian Goncalves · 2026

Epidemics have shaped human history, often with devastating consequences, motivating the development of mathematical models to understand and control their dynamics. Among the many aspects of epidemic…

AI & Data Science · Preprint

AutoSurfer -- Teaching Web Agents through Comprehensive Surfing, Learning, and Modeling

Fazle Elahi Faisal, Qianhui Wu, Baolin Peng, Jianfeng Gao · 2026

Recent advances in multimodal large language models (LLMs) have revolutionized web agents that can automate complex tasks on websites. However, their accuracy remains limited by the scarcity of high-q…

AI & Data Science · Preprint

Instruction Complexity Induces Positional Collapse in Adversarial LLM Evaluation

Jon-Paul Cacioli · 2026

When instructed to underperform on multiple-choice evaluations, do language models engage with question content or fall back on positional shortcuts? We map the boundary between these regimes using a …

Computer Science · Preprint

Theory Under Construction: Orchestrating Language Models for Research Software Where the Specification Evolves

Halley Young, Nikolaj Bjorner · 2026

Large language models can now generate substantial code and draft research text, but research-software projects require more than either artifact alone. The mathematical thesis, executable system, ben…

Computer Science · Preprint

CI-Repair-Bench: A Repository-Aware Benchmark for Automated Patch Validation via CI Workflows

Rabeya Khatun Muna, Md Nakhla Rafi, Tse-Hsun (Peter) Chen · 2026

Continuous Integration (CI) enforces repository-level correctness through multi-stage workflows and is central to modern software development, yet diagnosing and repairing CI failures remains challeng…

AI & Data Science · Preprint

Select to Think: Unlocking SLM Potential with Local Sufficiency

Wenxuan Ye, Yangyang Zhang, Xueli An, Georg Carle, Yunpu Ma · 2026

Small language models (SLMs) offer computational efficiency for scalable deployment, yet they often fall short of the reasoning power exhibited by their larger counterparts (LLMs). To mitigate this ga…

Engineering · Preprint

Walk With Me: Long-Horizon Social Navigation for Human-Centric Outdoor Assistance

Lingfeng Zhang, Xiaoshuai Hao, Xizhou Bu, Yingbo Tang, Hongsheng Li, Jinghui Lu, Xiu-shen Wei, Jiayi Ma, Yu Liu, Jing Zhang, Hangjun Ye, Xiaojun Liang, Long Chen, Wenbo Ding · 2026

Assisting humans in open-world outdoor environments requires robots to translate high-level natural-language intentions into safe, long-horizon, and socially compliant navigation behavior. Existing ma…

Computer Science · Preprint

AgentSim: A Platform for Verifiable Agent-Trace Simulation

Saber Zerhoudi, Michael Granitzer, Jelena Mitrovic · 2026

Training trustworthy agentic LLMs requires data that shows the grounded reasoning process, not just the final answer. Existing datasets fall short: question-answering data is outcome-only, chain-of-th…

Economics & Finance · Preprint

Dynamic Cheap Talk without Feedback

Atulya Jain · 2026

We study a dynamic sender-receiver game in which the sender observes a state evolving according to a Markov chain but does not observe the receiver's action. Despite the absence of feedback, dynamic i…

Mathematics · Preprint

On a conjecture of distance spectral extremal problems

Hongzhang Chen, Jianxi Li, Yongtao Li · 2026

Brualdi and Hoffman proposed a well-known problem of determining the graph with maximum adjacency spectral radius among all graphs with given size $m$. Early work by Friedland and Stanley addressed so…

Physics · Preprint

Study on the systematic effects on $b \to c$ inclusive semileptonic decays

Alessandro Barone, Ahmed Elgaziari, Shoji Hashimoto, Zhi Hu, Andreas Juttner, Takashi Kaneko, Ryan Kellermann · 2026

We discuss the calculation of the inclusive semileptonic decay for the process $B_s \to X_c \, l \nu_l$ using lattice QCD. This calculation could be decisive in understanding the long-standing tension…

Physics · Preprint

From short-lived to long-lived clouds: impact of star formation models on giant molecular cloud evolution in simulations of an NGC 300-like galaxy

Daniel Han, Taysun Kimm, Cheonsu Kang, Jaehyun Lee, Harley Katz, Joki Rosdahl · 2026

Multi-wavelength observations of molecular and ionized gas indicate that GMCs are short-lived, generally dispersing within one or two dynamical timescales. To investigate the physical origin of these …

Mathematics · Preprint

Asymptotic height of Plancherel random trees

Shengjun Zhang · 2026

We study a natural analogue of Ulam's problem for random rooted trees distributed according to a Plancherel-type measure. This probability measure is closely related to the classical Plancherel measur…

AI & Data Science · Preprint

SIEVES: Selective Prediction Generalizes through Visual Evidence Scoring

Hector G. Rodriguez, Marcus Rohrbach · 2026

Multimodal large language models (MLLMs) achieve ever-stronger performance on visual-language tasks. Even as traditional visual question answering benchmarks approach saturation, reliable deployment r…
