Anna Huber in Engineering — Research Repository

Engineering Preprint PDF DOI

BUT System Description for CHiME-9 MCoRec Challenge

Dominik Klement, Alexander Polok, Nguyen Hai Phong, Prachi Singh, Lukas Burget · 2026

Multi-talker automatic speech recognition (ASR) in conversational recordings remains an open problem, particularly in scenarios with a large portion of overlapping speech where identifying and transcrib…

SPG-Codec: Exploring the Role and Boundaries of Semantic Priors in Ultra-Low-Bitrate Neural Speech Coding

Mingyu Zhao, Zijian Lin, Kun Wei, Zhiyong Wu · 2026

Conventional neural speech codecs suffer from severe intelligibility degradation at ultra-low bitrates, where the bottleneck transitions from acoustic distortion to semantic loss. To address this issu…

Distributed Snitch Digital Twin-Based Anomaly Detection for Smart Voltage Source Converter-Enabled Wind Power Systems

Mohammad Ashraf Hossain Sadi, Soham Ghosh, Siby Plathottam, Mohd. Hasan Ali · 2026

Existing cyberattack detection methods for smart grids such as Artificial Neural Networks (ANNs) and Deep Reinforcement Learning (DRL) often suffer from limited adaptability, delayed response, and ina…

On ANN-enhanced positive invariance for nonlinear flat systems

Huu-Thinh Do, Ionela Prodan · 2026

The concept of positively invariant (PI) sets has proven effective in the formal verification of stability and safety properties for autonomous systems. However, the characterization of such sets is c…

VisG AV-HuBERT: Viseme-Guided AV-HuBERT

Aristeidis Papadopoulos, Rishabh Jain, Naomi Harte · 2026

Audio-Visual Speech Recognition (AVSR) systems nowadays integrate Large Language Model (LLM) decoders with transformer-based encoders, achieving state-of-the-art results. However, the relative contrib…

HArnESS: Lightweight Distilled Arabic Speech Foundation Models

Vrunda N. Sukhadia, Shammur Absar Chowdhury · 2026

Large self-supervised speech (SSL) models achieve strong downstream performance, but their size limits deployment in resource-constrained settings. We present HArnESS, an Arabic-centric self-supervise…

Acoustic-to-articulatory Inversion of the Complete Vocal Tract from RT-MRI with Various Audio Embeddings and Dataset Sizes

Sofiane Azzouz, Pierre-Andre Vuissoz, Yves Laprie · 2026

Articulatory-to-acoustic inversion strongly depends on the type of data used. While most previous studies rely on EMA, which is limited by the number of sensors and restricted to accessible articulato…

D-SPEAR: Dual-Stream Prioritized Experience Adaptive Replay for Stable Reinforcement Learning in Robotic Manipulation

Yu Zhang, Karl Mason · 2026

Robotic manipulation remains challenging for reinforcement learning due to contact-rich dynamics, long horizons, and training instability. Although off-policy actor-critic algorithms such as SAC and T…

Explainable Speech Emotion Recognition: Weighted Attribute Fairness to Model Demographic Contributions to Social Bias

Tomisin Ogunnubi, Yupei Li, Bjorn Schuller · 2026

Speech Emotion Recognition (SER) systems have growing applications in sensitive domains such as mental health and education, where biased predictions can cause harm. Traditional fairness metrics, such…

Outlier-Resistant Fusion for Multi-static Positioning using 5G NR Signals

Maximiliano Rivera Figueroa, Jannis Held, Pradyumna Kumar Bishoyi, Marina Petrova · 2026

Indoor positioning faces ongoing challenges due to complex propagation conditions, such as multipath propagation, signal blockages, and intrinsic target characteristics that substantially impact measu…

Linearized Bregman Iterations for Sparse Spiking Neural Networks

Daniel Windhager, Bernhard A. Moser, Michael Lunglmayr · 2026

Spiking Neural Networks (SNNs) offer an energy-efficient alternative to conventional Artificial Neural Networks (ANNs) but typically still require a large number of parameters. This work introduces Li…

A Learnable SIM Paradigm: Fundamentals, Training Techniques, and Applications

Hetong Wang, Yashuai Cao, Tiejun Lv · 2026

Stacked intelligent metasurfaces (SIMs) represent a breakthrough in wireless hardware by comprising multilayer, programmable metasurfaces capable of analog computing in the electromagnetic (EM) wave d…

Performance Bounds and Robust Filtering for LEO Inter-Satellite Synchronization under Cross-Epoch Doppler Coupling

Haofan Dong, Houtianfu Wang, Hanlin Cai, Ozgur B. Akan · 2026

Low Earth orbit (LEO) inter-satellite links (ISLs) must achieve joint synchronization and ranging under severe hardware impairments, namely oscillator phase noise, clock drift, and measurement outlier…

Bootstrapping Audiovisual Speech Recognition in Zero-AV-Resource Scenarios with Synthetic Visual Data

Pol Buitrago, Pol Galvez, Oriol Pareras, Javier Hernando · 2026

Audiovisual speech recognition (AVSR) combines acoustic and visual cues to improve transcription robustness under challenging conditions but remains out of reach for most under-resourced languages due…

Quantifying Cross-Lingual Transfer in Paralinguistic Speech Tasks

Pol Buitrago, Oriol Pareras, Federico Costa, Javier Hernando · 2026

Paralinguistic speech tasks are often considered relatively language-agnostic, as they rely on extralinguistic acoustic cues rather than lexical content. However, prior studies report performance degr…

Statistical-Geometric Degeneracy in UAV Search: A Physics-Aware Asymmetric Filtering Approach

Zhiyuan Ren, Yudong Fang, Tao Zhang, Wenchi Cheng, Ben Lan · 2026

Post-disaster survivor localization using Unmanned Aerial Vehicles (UAVs) faces a fundamental physical challenge: the prevalence of Non-Line-of-Sight (NLOS) propagation in collapsed structures. Unlike…

HuPER: A Human-Inspired Framework for Phonetic Perception

Chenxu Guo, Jiachen Lian, Yisi Liu, Baihe Huang, Shriyaa Narayanan, Cheol Jun Cho, Gopala Anumanchipalli · 2026

We propose HuPER, a human-inspired framework that models phonetic perception as adaptive inference over acoustic-phonetics evidence and linguistic knowledge. With only 100 hours of training data, HuPE…

Soft Clustering Anchors for Self-Supervised Speech Representation Learning in Joint Embedding Prediction Architectures

Georgios Ioannides, Adrian Kieback, Judah Goldfeder, Linsey Pang, Aman Chadha, Aaron Elkins, Yann LeCun, Ravid Shwartz-Ziv · 2026

Joint Embedding Predictive Architectures (JEPA) offer a promising approach to self-supervised speech representation learning, but suffer from representation collapse without explicit grounding. We pro…

Spiking Neural Networks for Communication Systems: Encoding Schemes, Learning Algorithms, and Equalization Techniques

Eike-Manuel Edelmann · 2026

Machine learning with artificial neural networks (ANNs) provides solutions for the growing complexity of modern communication systems. This complexity, however, increases power consumption, making th…

Event-based Heterogeneous Information Processing for Online Vision-based Obstacle Detection and Localization

Reza Ahmadvand, Sarah Safura Sharif, Yaser Mike Banad · 2026

This paper introduces a novel framework for robotic vision-based navigation that integrates Hybrid Neural Networks (HNNs) with Spiking Neural Network (SNN)-based filtering to enhance situational aware…
