Florian Huc — Research Repository

AI & Data Science · Preprint

On the Proper Treatment of Units in Surprisal Theory

Samuel Kiegeland, Vesteinn Snæbjarnarson, Tim Vieira, Ryan Cotterell · 2026

Surprisal theory links human processing effort to the predictability of an upcoming linguistic unit, but empirical work often leaves the notion of a unit underspecified. In practice, experimental stim…

Computer Science · Preprint

Index-Assisted Stratified Sampling for Online Aggregation

Yunnan Yu, Zhuoyue Zhao · 2026

Ad-hoc queries over frequently updated data in a flat schema are common in real-time data analysis applications and often require very low latency. Online aggregation can achieve this by providing appro…

AI & Data Science · Preprint

Do Sparse Autoencoders Capture Concept Manifolds?

Usha Bhalla, Thomas Fel, Can Rager, Sheridan Feucht, Tal Haklay, Daniel Wurgaft, Siddharth Boppana, Matthew Kowal, Vasudev Shyam, Jack Merullo, Atticus Geiger, Ekdeep Singh Lubana · 2026

Sparse autoencoders (SAEs) are widely used to extract interpretable features from neural network representations, often under the implicit assumption that concepts correspond to independent linear dir…

Computer Science · Preprint

Akita: A High Usability Simulation Framework for Computer Architecture

Sabila Al Jannat, Ying Li, Mengyang He, Xuzhong Wang, Huizhi Zhao, Jingxiang Sun, Daoxuan Xu, Enze Xu, Yifan Sun · 2026

Computer architecture simulation is essential for evaluating new designs without the need for costly tapeout. The community has developed dozens of valuable simulators that have enabled significant ar…

AI & Data Science · Preprint

PROMISE-AD: Progression-aware Multi-horizon Survival Estimation for Alzheimer's Disease Progression and Dynamic Tracking

Qing Lyu, Jeremy Hudson, Mohammad Kawas, Yuming Jiang, Chenyu You, Christopher T Whitlow · 2026

Individualized Alzheimer's disease (AD) progression prediction requires models that use irregular visits, account for censoring, avoid diagnostic leakage, and provide calibrated horizon risks. We prop…

AI & Data Science · Preprint

Collaborative Agent Reasoning Engineering (CARE): A Three-Party Design Methodology for Systematically Engineering AI Agents with Subject Matter Experts, Developers, and Helper Agents

Rahul Ramachandran, Nidhi Jha, Muthukumaran Ramasubramanian · 2026

We present Collaborative Agent Reasoning Engineering (CARE), a disciplined methodology for engineering Large Language Model (LLM) agents in scientific domains. Unlike ad-hoc trial-and-error approaches…

Computer Science · Preprint

Energy-Aware Quantum-Enhanced Computing Continuum

Carlos J. Barrios H., Frederic Le Mouel, Oscar Carrillo · 2026

We discuss a Quantum-Enhanced Computing Continuum, a heterogeneous, hybrid architecture that integrates quantum processing units (QPUs) within an Edge-Cloud-HPC fabric. Promote sustainability by shift…

Sociology & Anthropology · Preprint

Scale-freeness under node removal: a finite-size scaling perspective

Yeonsu Jeong, Deok-Sun Lee, Mi Jin Lee, Seung-Woo Son · 2026

In heterogeneous network systems such as ecological and social networks, structural stability depends on how connectivity changes under node removal, as different removal sequences can trigger distinc…

AI & Data Science · Preprint

Geometry-Calibrated Conformal Abstention for Language Models

Rui Xu, Yi Chen, Sihong Xie, Hui Xiong · 2026

When language models lack relevant knowledge for a given query, they frequently generate plausible responses that can be hallucinations, rather than admitting being agnostic about the answer. Retraini…

AI & Data Science · Preprint

Post-Optimization Adaptive Rank Allocation for LoRA

Vishnuprasadh Kumaravelu, Sunil Gupta, P. K. Srijith · 2026

Exponential growth in the scale of modern foundation models has led to the widespread adoption of Low-Rank Adaptation (LoRA) as a parameter-efficient fine-tuning technique. However, standard LoRA impl…

Computer Science · Preprint

RuC: HDL-Agnostic Rule Completion Benchmark Generation

Arnau Ayguade Domingo, Miquel Alberti-Binimelis, Cristian Gutierrez-Gomez, Emanuele Parisi, Razine Moundir Ghorab, Miquel Moreto, Gokcen Kestor, Dario Garcia-Gasulla · 2026

Large Language Models (LLMs) have rapidly improved in performance across code-related tasks, making their integration into Register Transfer Level (RTL) development increasingly attractive. Mimicking …

Computer Science · Preprint

LLM-as-a-Judge for Human-AI Co-Creation: A Reliability-Aware Evaluation Framework for Coding

Md Faizul Ibne Amin, Yutaka Watanobe, Daniel M. Muepu, Haruto Suzuki, Kenta Nanaumi, Md Mostafizer Rahman · 2026

LLMs are increasingly employed both as judges for evaluating open-ended outputs and as co-creation partners in AI-assisted programming; yet rigorous evaluation in human-AI co-creation settings remains…

Computer Science · Preprint

PuzzleMark: Implicit Jigsaw Learning for Robust Code Dataset Watermarking in Neural Code Completion Models

Haocheng Huang, Yuchen Chen, Weisong Sun, Peizhuo Lv, Yuan Xiao, Chunrong Fang, Yang Liu, Xiaofang Zhang · 2026

Constructing and curating high-quality code datasets requires significant resources, making them valuable intellectual property. Unfortunately, these datasets currently face severe risks of unauthoriz…

AI & Data Science · Preprint

One Single Hub Text Breaks CLIP: Identifying Vulnerabilities in Cross-Modal Encoders via Hubness

Hiroyuki Deguchi, Katsuki Chousa, Yusuke Sakai · 2026

The hubness problem, in which hub embeddings are close to many unrelated examples, occurs often in high-dimensional embedding spaces and may pose a practical threat for purposes such as information re…

Physics · Preprint

Blazar flares from plasma blobs crossing the broad-line region

Sebastien Le Bihan, Anton Dmytriiev, Andreas Zech · 2026

The blazar 3C 279 is well known for its rapid and large-amplitude variability. On 20 December 2013, the source exhibited an orphan γ-ray flare characterized by a flux-doubling timescale of a fe…

AI & Data Science · Preprint

SECOS: Semantic Capture for Rigorous Classification in Open-World Semi-Supervised Learning

Hezhao Liu, Jiacheng Yang, Junlong Gao, Mengke Li, Yiqun Zhang, Shreyank N Gowda, Yang Lu · 2026

In open-world semi-supervised learning (OWSSL), a model learns from labeled data and unlabeled data containing both known and novel classes. In practical OWSSL applications, models are expected to per…

Computer Science · Preprint

Towards the Democratization and Standardization of Dynamic Resources with MPI Spawning

Sergio Iserte, Iker Martin-Alvarez, Krzystof Rojek, Jose I. Aliaga, Maribel Castillo, Antonio J. Pena · 2026

This paper presents an efficient tool for managing dynamic resources in production high-performance computing (HPC) settings, focusing on flexibility, adaptability, and user-friendliness. We introduce…

Physics · Preprint

Kolmogorov-Sinai entropies identify optimal observables for prediction and dynamics reconstruction in chaotic systems

Maximilian Topel · 2026

Choosing the optimal observable to model dynamical systems for which we do not know the driving equations is nearly always an ad hoc art. Takens' Delay Embedding Theorem guarantees a diffeomorphism be…

AI & Data Science · Preprint

Proactive Dialogue Model with Intent Prediction

Yang Luo · 2026

Dialogue models are inherently reactive, responding to the current user turn without anticipating upcoming intents, which leads to redundant interactions in multi-intent settings. We address this limi…

AI & Data Science · Preprint

LLMs Capture Emotion Labels, Not Emotion Uncertainty: Distributional Analysis and Calibration of Human-LLM Judgment Gaps

Keito Inoshita, Xiaokang Zhou, Akira Kawai, Katsutoshi Yada · 2026

Human annotators frequently disagree on emotion labels, yet most evaluations of Large Language Model (LLM) emotion annotation collapse these judgments into a single gold standard, discarding the distr…
