Expertini Research

Browse Research Papers

7,521+ open-access research outputs.

Showing 7521 results for "muriel medard"
AI & Data Science Preprint PDF DOI

RHyVE: Competence-Aware Verification and Phase-Aware Deployment for LLM-Generated Reward Hypotheses

Feiyu Wu, Xu Zheng, Zhuocheng Wang, Yi ming Dai, Hui Li · 2026

Large language models (LLMs) make reward design in reinforcement learning substantially more scalable, but generated rewards are not automatically reliable training objectives. Existing work has focus…

AI & Data Science Preprint PDF DOI

Calibrating Attribution Proxies for Reward Allocation in Participatory Weather Sensing

Mark C. Ballandies, Michael T. C. Chiu, Claudio J. Tessone · 2026

Large-scale IoT weather sensing networks require incentive mechanisms to sustain participation, yet determining how much value individual data contributions bring to the network remains an open proble…

AI & Data Science Preprint PDF DOI

Iterative Multimodal Retrieval-Augmented Generation for Medical Question Answering

Xupeng Chen, Binbin Shi, Chenqian Le, Jiaqi Zhang, Kewen Wang, Ran Gong, Jinhan Zhang, Chihang Wang · 2026

Medical retrieval-augmented generation (RAG) systems typically operate on text chunks extracted from biomedical literature, discarding the rich visual content (tables, figures, structured layouts) of …

AI & Data Science Preprint PDF DOI

Debiasing Reward Models via Causally Motivated Inference-Time Intervention

Kazutoshi Shinoda, Kosuke Nishida, Kyosuke Nishida · 2026

Reward models (RMs) play a central role in aligning large language models (LLMs) with human preferences. However, RMs are often sensitive to spurious features such as response length. Existing inferen…

AI & Data Science Preprint PDF DOI

From Coarse to Fine: Benchmarking and Reward Modeling for Writing-Centric Generation Tasks

Qingyu Ren, Tianjun Pan, Xingzhou Chen, Xuhong Wang · 2026

Large language models have achieved remarkable progress in text generation but still struggle with generative writing tasks. In terms of evaluation, existing benchmarks evaluate writing reward models …

Engineering Preprint PDF DOI

A Knowledge-Driven Approach to Target Speech Extraction in the Presence of Background Sound Effects for Cinematic Audio Source Separation (CASS)

Chun-wei Ho, Sabato Marco Siniscalchi, Kai Li, Chin-Hui Lee · 2026

We propose a knowledge-driven approach to speech target extraction in the presence of background sound effects already recorded in cinematic audio. The specific knowledge sources studied are manners o…

AI & Data Science Preprint PDF DOI

How to Guide Your Flow: Few-Step Alignment via Flow Map Reward Guidance

Jerry Y. Huang, Justin Lin, Sheel Shah, Kartik Nair, Nicholas M. Boffi · 2026

In generative modeling, we often wish to produce samples that maximize a user-specified reward such as aesthetic quality or alignment with human preferences, a problem known as guidance. Despite their…

AI & Data Science Preprint PDF DOI

AdvDMD: Adversarial Reward Meets DMD For High-Quality Few-Step Generation

Xu Wang, Zexian Li, Litong Gong, Tiezheng Ge, Zhijie Deng · 2026

Diffusion models offer superior generation quality at the expense of extensive sampling steps. Distillation methods, with Distribution Matching Distillation (DMD) as a popular example, can mitigate …

AI & Data Science Preprint PDF DOI

Population Dynamics in ARIEL Robotics Systems Featuring Embodied Evolution via Spatial Mating Mechanisms

Victoria Peterson, Akshat Srivastava, Raghav Prabhakar · 2026

We present a Spatially Embedded Evolutionary Algorithm where robot individuals exist in a physically simulated 2D environment, must navigate to encounter potential mates, and compete for survival unde…

Physics Preprint PDF DOI

Normalizing flows for density estimation in multi-detector gravitational-wave searches

Sam Insley, Michael J. Williams, Rahul Dhurkunde, Ian Harry · 2026

Identifying compact binary coalescences buried within the non-Gaussian and non-stationary data taken by gravitational-wave interferometers requires sophisticated search pipelines, such as the PyCBC an…

AI & Data Science Preprint PDF DOI

Uncertainty-Aware Reward Discounting for Mitigating Reward Hacking

Disha Singha · 2026

Reinforcement learning (RL) systems typically optimize scalar reward functions that assume precise and reliable evaluation of outcomes. However, real-world objectives--especially those derived from hu…

Computer Science Preprint PDF DOI

Towards Low-Cost Low-Power Activity-Aware Soil Moisture Sensing Platform for Large-scale Farming

Jack Thoene, Omar Kamil, Thekra Alkadee, Nivedita Arora · 2026

Deep understanding of a field's soil moisture content is the leading indicator for predicting crop yields and making data driven decisions for irrigation and application of topical chemicals for droug…

Computer Science Preprint PDF DOI

Institutional Floors and Partisan Lenses: Cross-National Online Discourse on Political Violence in France and the United States

Andrew Yen Chang · 2026

This paper studies how online discussion shapes and assesses political violence across different settings, particularly how moral evaluation, as a social perception, varies across institutional contex…

AI & Data Science Preprint PDF DOI

reward-lens: A Mechanistic Interpretability Library for Reward Models

Mohammed Suhail B Nadaf · 2026

Every RLHF-trained language model is shaped by a reward model, yet the mechanistic interpretability toolkit -- logit lens, direct logit attribution, activation patching, sparse autoencoders -- was bui…

Computer Science Preprint PDF DOI

Embedded Rust or C Firmware? Lessons from an Industrial Microcontroller Use Case with Ariel OS

Bipin Thapa, Daniele Alfonso, Lorenzo Bini, Licio Mapelli, Kaspar Schleiser, Romain Fouquet, Emmanuel Baccelli · 2026

As Rust gains traction for developing safer systems software, a reality check for the microcontroller hardware segment becomes necessary. How ready is the Rust ecosystem for this segment? Can Rust com…

Computer Science Preprint PDF DOI

Medoid Prototype Alignment for Cross-Plant Unknown Attack Detection in Industrial Control Systems

Luyao Wang · 2026

Deploying an intrusion detector trained in one industrial plant to another remains difficult because Industrial Control System (ICS) traffic is highly site-dependent, labels are scarce, and unseen att…

Computer Science Preprint PDF DOI

R$^3$-SQL: Ranking Reward and Resampling for Text-to-SQL

Hojae Han, Yeonseok Jeong, Seung-won Hwang, Zhewei Yao, Yuxiong He · 2026

Modern Text-to-SQL systems generate multiple candidate SQL queries and rank them to judge a final prediction. However, existing methods face two limitations. First, they often score functionally equiv…

AI & Data Science Preprint PDF DOI

Zero Shot Coordination for Sparse Reward Tasks with Diverse Reward Shapings

Keenan Powell, Peihong Yu, Pratap Tokekar · 2026

Many Multi-Agent Reinforcement Learning (MARL) agents fail to adapt properly to cooperating with agents trained with the same objectives but different seeds, algorithms, or other training differences.…

AI & Data Science Preprint PDF DOI

Improving Vision-language Models with Perception-centric Process Reward Models

Yingqian Min, Kun Zhou, Yifan Li, Yuhuan Wu, Han Peng, Yifan Du, Wayne Xin Zhao, Min Yang, Ji-Rong Wen · 2026

Recent advancements in reinforcement learning with verifiable rewards (RLVR) have significantly improved the complex reasoning ability of vision-language models (VLMs). However, its outcome-level supe…

AI & Data Science Preprint PDF DOI

A Reward-Free Viewpoint on Multi-Objective Reinforcement Learning

Ying-Tu Chen, Wei Hung, Bing-Shu Wu, Zhang-Wei Hong, Ping-Chun Hsieh · 2026

Many sequential decision-making tasks involve optimizing multiple conflicting objectives, requiring policies that adapt to different user preferences. In multi-objective reinforcement learning (MORL),…

Page 1 of 377