Jeffery Weir — Research Repository

AI & Data Science Preprint

HATS: An Open data set Integrating Human Perception Applied to the Evaluation of Automatic Speech Recognition Metrics

Thibault Baneras-Roux, Jane Wottawa, Mickael Rouvier, Teva Merlin, Richard Dufour · 2026

Conventionally, Automatic Speech Recognition (ASR) systems are evaluated on their ability to correctly recognize each word contained in a speech signal. In this context, the word error rate (WER) metr…

AI & Data Science Preprint

Qualitative Evaluation of Language Model Rescoring in Automatic Speech Recognition

Thibault Baneras-Roux, Mickael Rouvier, Jane Wottawa, Richard Dufour · 2026

Evaluating automatic speech recognition (ASR) systems is a classical but difficult and still open problem, which often boils down to focusing only on the word error rate (WER). However, this metric su…

Mathematics Preprint

Quantitative homogenization of the maximal action of curves in a Brownian potential

Felix Otto, Matteo Palmieri · 2026

Motivated by an optimal-matching problem (Leighton-Shor) and the random-field Ising model (Aizenman-Wehr, Ding-Wirth), we consider a variational problem for graphs in $1+1$ dimensions maximizing an act…

Mathematics Preprint

An improved non-linear Roth-type theorem in finite fields

Mark Lewko · 2026

Let $F$ be a finite field of odd characteristic. We prove that any set $A\subset F$ with $|A|\geq C|F|^{5/6}$ contains a nontrivial quadratic progression $(x, x+y, x+y^2), y\neq 0.$ For prime fields, …

Engineering Preprint

BUT System Description for CHiME-9 MCoRec Challenge

Dominik Klement, Alexander Polok, Nguyen Hai Phong, Prachi Singh, Lukas Burget · 2026

Multi-talker automatic speech recognition (ASR) in conversational recordings remains an open problem, particularly in scenarios with a large portion of overlapping speech where identifying and transcrib…

Computer Science Preprint

A Reproducibility Study of LLM-Based Query Reformulation

Amin Bigdeli, Radin Hamidi Rad, Hai Son Le, Mert Incesu, Negar Arabzadeh, Charles L. A. Clarke, Ebrahim Bagheri · 2026

Large Language Models (LLMs) are now widely used for query reformulation and expansion in Information Retrieval, with many studies reporting substantial effectiveness gains. However, these results are…

Computer Science Preprint

Few-Shot Accent Synthesis for ASR with LLM-Guided Phoneme Editing

Yurii Halychanskyi, Nimet Beyza Bozdag, Mark Hasegawa-Johnson, Dilek Hakkani-Tur, Volodymyr Kindratenko · 2026

Accented automatic speech recognition (ASR) often degrades due to the limited availability of accented training data. Prior work has explored accent modeling in low-resource settings, but existing app…

Engineering Preprint

SPG-Codec: Exploring the Role and Boundaries of Semantic Priors in Ultra-Low-Bitrate Neural Speech Coding

Mingyu Zhao, Zijian Lin, Kun Wei, Zhiyong Wu · 2026

Conventional neural speech codecs suffer from severe intelligibility degradation at ultra-low bitrates, where the bottleneck transitions from acoustic distortion to semantic loss. To address this issu…

Engineering Preprint

One Voice, Many Tongues: Cross-Lingual Voice Cloning for Scientific Speech

Amanuel Gizachew Abebe, Yasmin Moslem · 2026

Preserving a speaker's voice identity while generating speech in a different language remains a fundamental challenge in spoken language technology, particularly in specialized domains such as scienti…

Mathematics Preprint

The coordinate ring of the universal centralizer via Demazure operators

Tom Gannon, Victor Ginzburg · 2026

We give a simple description of the coordinate ring of the universal centralizer associated to a simply connected semisimple group. To this end, we prove a general result on Weil restriction of affine…

AI & Data Science Preprint

WhisperPipe: A Resource-Efficient Streaming Architecture for Real-Time Automatic Speech Recognition

Erfan Ramezani, Mohammad Mahdi Giahi, Mohammad Erfan Zarabadipour, Amir Reza Yosefian, Hamid Ghadiri · 2026

Real-time automatic speech recognition (ASR) systems face a fundamental trade-off between transcription accuracy and computational efficiency, particularly when deploying large-scale transformer model…

Computer Science Preprint

PSP: An Interpretable Per-Dimension Accent Benchmark for Indic Text-to-Speech

Venkata Pushpak Teja Menta · 2026

Standard text-to-speech (TTS) evaluation measures intelligibility (WER, CER) and overall naturalness (MOS, UTMOS) but does not quantify accent. A synthesiser may score well on all four yet sound non-n…

Computer Science Preprint

Praxy Voice: Voice-Prompt Recovery + BUPS for Commercial-Class Indic TTS from a Frozen Non-Indic Base at Zero Commercial-Training-Data Cost

Venkata Pushpak Teja Menta · 2026

Commercial TTS systems produce near-native Indic audio, but the best open-source bases (Chatterbox, Indic Parler-TTS, IndicF5) trail them on measured phonological dimensions, and the most widely adopt…

Computer Science Preprint

Author response to commentaries on H is for Human and How (Not) to Evaluate Qualitative Research in HCI

Andy Crabtree · 2026

This is the author's response to commentaries on the original article H is for Human and How (Not) to Evaluate Qualitative Research in HCI (https://doi.org/10.1080/07370024.2025.2475743). Commentaries we…

Mathematics Preprint

Reciprocity and the Maslov Phase

Jonathan Holland · 2026

We give a metaplectic proof of Hilbert reciprocity, and hence of quadratic reciprocity, in which the local phase is the Kashiwara--Maslov phase of a triple of Lagrangians. In rank two the phase of the…

AI & Data Science Preprint

Benchmarking OCR Pipelines with Adaptive Enhancement for Multi-Domain Retail Bill Digitization

Vijaysinh Gaikwad · 2026

The digitization of multi-domain retail billing documents remains a challenging task due to variability in scan quality, layout heterogeneity, and domain diversity across commercial sectors. This pape…

Computer Science Preprint

UnIte: Uncertainty-based Iterative Document Sampling for Domain Adaptation in Information Retrieval

Jongyoon Kim, Minseong Hwang, Seung-won Hwang · 2026

Unsupervised domain adaptation generalizes neural retrievers to an unseen domain by generating pseudo queries on target domain documents. The quality and efficiency of this adaptation critically depen…

Computer Science Preprint

The Blahut--Arimoto Algorithm as a Dynamical System with Exact $\chi^2$ Dissipation

Qiao Wang · 2026

This paper uncovers an exact $\chi^2$ dissipation identity for the Blahut--Arimoto (BA) flow and establishes its fundamental information-geometric structure. While prior works have analyzed BA converg…

Mathematics Preprint

Integral representation of polynomial local functionals on convex functions

Jonas Knoerr · 2026

Integral representations for continuous polynomial local functionals on convex functions are established in terms of a finite family of polynomials. This result is obtained by approximation from a cla…

Computer Science Preprint

Prism-Reranker: Beyond Relevance Scoring -- Jointly Producing Contributions and Evidence for Agentic Retrieval

Dun Zhang · 2026

Modern retrieval pipelines increasingly serve downstream consumers like retrieval-augmented generation (RAG) and autonomous agents that need more than a scalar relevance score. A reranker that only te…

