Yury Polyanskiy — Research Repository

AI & Data Science · Preprint

Agentic Harness Engineering: Observability-Driven Automatic Evolution of Coding-Agent Harnesses

Jiahang Lin, Shichun Liu, Chengjun Pan, Lizhi Lin, Shihan Dou, Xuanjing Huang, Hang Yan, Zhenhua Han, Tao Gui · 2026

Harnesses are now central to coding-agent performance, mediating how models interact with tools and execution environments. Yet harness engineering remains a manual craft, because automating it faces …

AI & Data Science · Preprint

JURY-RL: Votes Propose, Proofs Dispose for Label-Free RLVR

Xinjie Chen, Biao Fu, Jing Wu, Guoxin Chen, Xinggao Liu, Dayiheng Liu, Minpeng Liao · 2026

Reinforcement learning with verifiable rewards (RLVR) enhances the reasoning of large language models (LLMs), but standard RLVR often depends on human-annotated answers or carefully curated reward spe…

AI & Data Science · Preprint

Evaluating Multimodal LLMs for Inpatient Diagnosis: Real-World Performance, Safety, and Cost Across Ten Frontier Models

Bruce A. Bassett, Amy Rouillard, Sitwala Mundia, Michael Cameron Gramanie, Linda Camara, Ziyaad Dangor, Shabir A. Madhi, Kajal Morar, Marlvin T. Ncube, Ismail Kalla, Haroon Saloojee · 2026

Background: Large language models (LLMs) are increasingly proposed for diagnostic support, but few evaluations use real-world multimodal inpatient data, particularly in low and middle-income country (…

AI & Data Science · Preprint

Can LLMs Score Medical Diagnoses and Clinical Reasoning as well as Expert Panels?

Amy Rouillard, Sitwala Mundia, Linda Camara, Michael Cameron Gramanie, Ziyaad Dangor, Ismail Kalla, Shabir A. Madhi, Kajal Morar, Marlvin T. Ncube, Haroon Saloojee, Bruce A. Bassett · 2026

Evaluating medical AI systems using expert clinician panels is costly and slow, motivating the use of large language models (LLMs) as alternative adjudicators. Here, we evaluate an LLM jury composed o…

AI & Data Science · Preprint

TRUST Agents: A Collaborative Multi-Agent Framework for Fake News Detection, Explainable Verification, and Logic-Aware Claim Reasoning

Gautama Shastry Bulusu Venkata, Santhosh Kakarla, Maheedhar Omtri Mohan, Aishwarya Gaddam · 2026

TRUST Agents is a collaborative multi-agent framework for explainable fact verification and fake news detection. Rather than treating verification as a simple true-or-false classification task, the sy…

AI & Data Science · Preprint

Agentic Video Generation: From Text to Executable Event Graphs via Tool-Constrained LLM Planning

Nicolae Cudlenco, Mihai Masala, Marius Leordeanu · 2026

Existing multi-agent video generation systems use LLM agents to orchestrate neural video generators, producing visually impressive but semantically unreliable outputs with no ground truth annotations.…

Computer Science · Preprint

Diffusion Denoiser Achievable Analysis for Finite Blocklength Unsourced Random Access

Yuming Han, Yuxin Long · 2026

Polyanskiy proposed a framework for the unsourced multiple access channel (MAC) problem where users employ a common codebook in the finite blocklength regime. However, existing approaches handle chann…

AI & Data Science · Preprint

COMPOSITE-Stem

Kyle Waters, Lucas Nuzzi, Tadhg Looram, Alessandro Tomasiello, Ariel Ghislain Kemogne Kamdoum, Bikun Li, Damien Sileo, Egor Kretov, Francesco Fournier-Facio, Georgios Soloupis, Haile Kassahun, Hew Wolff, Jiaqi Cai, Lianghui Li, Marc Roth, Mohinder Naiya, Naixu Guo, Qicheng Tang, Richard Wheeler, Samuele Sala, Serguei Popov, Steven Dillmann, Yuqi Li · 2026

AI agents hold growing promise for accelerating scientific discovery; yet, a lack of frontier evaluations hinders adoption into real workflows. Expert-written benchmarks have proven effective at measu…

Computer Science · Preprint

Doctoral Theses in France (1985-2025): A Linked Dataset of PhDs, Academic Networks, and Institutions

William Aboucaya, Dastan Jasim · 2026

This paper presents a comprehensive dataset of doctoral theses defended in France between 1985 and 2025, constructed from multiple national academic metadata sources. The dataset is primarily based on…

AI & Data Science · Preprint

An Agentic Evaluation Architecture for Historical Bias Detection in Educational Textbooks

Gabriel Stefan, Adrian-Marius Dumitran · 2026

History textbooks often contain implicit biases, nationalist framing, and selective omissions that are difficult to audit at scale. We propose an agentic evaluation architecture comprising a multimoda…

Computer Science · Preprint

SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills

Yinghan Hou, Zongyou Yang · 2026

OpenClaw's ClawHub marketplace hosts over 13,000 community-contributed agent skills, and between 13% and 26% of them contain security vulnerabilities according to recent audits. Regex scanners miss ob…

Mathematics · Preprint

A note on piercing discrete rectangles

Wei Rao · 2026

In 2008, Halman proved a discrete Helly-type theorem for axis-parallel boxes in $\mathbb R^d$. Very recently, this result was extended to the $(p,q)$ setting with $p \geq q \geq d+1$ by Edwards and So…

Economics & Finance · Preprint

Adversarial Selection

Alma Cohen, Alon Klement, Zvika Neeman, Eilon Solan · 2026

In many institutional settings, $k$ items are selected with the goal of representing the underlying distribution of claims, opinions, or characteristics in a large population. We study environments wi…

AI & Data Science · Preprint

From Pixels to Semantics: A Multi-Stage AI Framework for Structural Damage Detection in Satellite Imagery

Bijay Shakya, Catherine Hoier, Khandaker Mamun Ahmed · 2026

Rapid and accurate structural damage assessment following natural disasters is critical for effective emergency response and recovery. However, remote sensing imagery often suffers from low spatial re…

AI & Data Science · Preprint

Multiperspectivity as a Resource for Narrative Similarity Prediction

Max Upravitelev, Veronika Solopova, Jing Yang, Charlott Jakob, Premtim Sahitaj, Ariana Sahitaj, Vera Schmitt · 2026

Predicting narrative similarity can be understood as an inherently interpretive task: different, equally valid readings of the same text can produce divergent interpretations and thus different simila…

AI & Data Science · Preprint

Using LLM-as-a-Judge/Jury to Advance Scalable, Clinically-Validated Safety Evaluations of Model Responses to Users Demonstrating Psychosis

May Lynn Reese, Markela Zeneli, Mindy Ng, Jacob Haimes, Andreea Damien, Elizabeth Stade · 2026

General-purpose Large Language Models (LLMs) are becoming widely adopted by people for mental health support. Yet emerging evidence suggests there are significant risks associated with high-frequency …

Physics · Preprint

Quantum Hamlets: Distributed Compilation of Large Algorithmic Graph States

Anthony Micciche, Naphan Benchasattabuse, Andrew McGregor, Michal Hajdusek, Rodney Van Meter, Stefan Krastanov · 2026

We investigate the problem of compiling the generation of graph states to arbitrarily many distributed homogeneous quantum processing units (QPUs), providing a scalable partitioning algorithm and grap…

AI & Data Science · Preprint

Scalable Evaluation of the Realism of Synthetic Environmental Augmentations in Images

Damian J. Ruck, Paul Vautravers, Oliver Chalkley, Jake Thomas · 2026

Evaluation of AI systems often requires synthetic test cases, particularly for rare or safety-critical conditions that are difficult to observe in operational data. Generative AI offers a promising ap…

Physics · Preprint

The Road to Useful Quantum Computers

Timothy Proctor, Robin Blume-Kohout, Andrew Baczewski · 2026

Building a useful quantum computer is a grand science and engineering challenge, currently pursued intensely by teams around the world. In the 1980s, Richard Feynman and Yuri Manin observed independen…

AI & Data Science · Preprint

Epistemic Filtering and Collective Hallucination: A Jury Theorem for Confidence-Calibrated Agents

Jonas Karge · 2026

We investigate the collective accuracy of heterogeneous agents who learn to estimate their own reliability over time and selectively abstain from voting. While classical epistemic voting results, such…
