Martin Bues in Engineering — Research Repository

Engineering Preprint PDF DOI

Bitwise Over-Parameterized Neural Polar Decoding: A Theoretical Performance Analysis

Hongzhi Zhu, Wei Xu, Xiaohu You · 2026

This paper proposes a bitwise over-parameterized neural network (ONN) decoder for polar-coded transmission and develops a tractable theoretical performance analysis framework. By modeling each synthes…

SASI: Leveraging Sub-Action Semantics for Robust Early Action Recognition in Human-Robot Interaction

Yongpeng Cao, Masahiro Hirano, Hyuno Kim, Yuji Yamakawa · 2026

Understanding human actions is critical for advancing behavior analysis in human-robot interaction. Particularly in tasks that demand quick and proactive feedback, robots must recognize human actions …

BUT System Description for CHiME-9 MCoRec Challenge

Dominik Klement, Alexander Polok, Nguyen Hai Phong, Prachi Singh, Lukas Burget · 2026

Multi-talker automatic speech recognition (ASR) in conversational recordings remains an open problem, particularly in scenarios with a large portion of overlapping speech where identifying and transcrib…

Representative Spectral Correlation Network for Multi-source Remote Sensing Image Classification

Chuanzheng Gong, Feng Gao, Junyan Lin, Junyu Dong, Qian Du · 2026

Hyperspectral image (HSI) and SAR/LiDAR data offer complementary spectral and structural information for land-cover classification. However, their effective fusion remains challenging due to two major…

Learning Tactile-Aware Quadrupedal Loco-Manipulation Policies

Pokuang Zhou, Yuhao Zhou, Quan Luu, Seungho Han, Heng Zhang, Binghao Huang, Yunzhu Li, Arash Ajoudani, Zhengtong Xu, Yu She · 2026

Quadrupedal loco-manipulation is commonly built on visual perception and proprioception. Yet reliable contact-rich manipulation remains difficult: vision and proprioception alone cannot resolve uncert…

The False Resonance: A Critical Examination of Emotion Embedding Similarity for Speech Generation Evaluation

Yun-Shao Tsai, Yi-Cheng Lin, Huang-Cheng Chou, Tzu-Wen Hsu, Yun-Man Hsu, Chun Wei Chen, Shrikanth Narayanan, Hung-yi Lee · 2026

Objective metrics for emotional expressiveness are vital for speech generation, particularly in expressive synthesis and voice conversion requiring emotional prosody transfer. To quantify this, the fi…

Dual-LoRA: Parameter-Efficient Adversarial Disentanglement for Cross-Lingual Speaker Verification

Qituan Shangguan, Junhao Du, Kunyang Peng, Feng Xue, Hui Zhang, Xinsheng Wang, Kai Yu, Shuai Wang · 2026

Cross-lingual speaker verification suffers from severe language-speaker entanglement. This causes systematic degradation in the hardest scenario: correctly accepting utterances from the same speaker a…

Local Shifted Passivity Analysis of the Single-Machine Infinite-Bus System

Xinyuan Jiang · 2026

This letter presents a shifted passivity analysis of the single-machine infinite-bus system in the stationary ($\alpha\beta$) reference frame. We study the attractivity of a periodic synchronous stead…

Robust Graph Matching through Semantic Relationship Generation for SLAM

David Perez-Saura, Jose Andres Millan-Romera, Miguel Fernandez-Cortizas, Holger Voos, Pascual Campoy, Jose Luis Sanchez-Lopez · 2026

Graph-based representations such as Scene Graphs enable localization in structured indoor environments by matching a locally observed graph, constructed from sensor data, to a prior map. This process …

Robust Accent Identification via Voice Conversion and Non-Timbral Embeddings

Rayane Bakari, Olivier Le Blouch, Nicolas Gengembre, Nicholas Evans · 2026

Automatic accent identification (AID) remains a challenging task due to the complex variability of accents, the entanglement of accent cues with speaker traits, and the scarcity of reliable accentlabe…

A Novel Two-Step Approach for Reactive Power Demand Calculation Using Integrated Voltage Stability Analysis

Hassan Abouelgheit, Hendrik Lens · 2026

The assessment of reactive power demand plays an instrumental role in power system planning. This paper presents a methodology for calculating reactive power demand based on a two-step approach. Unlik…

Passage-Aware Structural Mapping for RGB-D Visual SLAM

Ali Tourani, Miguel Fernandez-Cortizas, Saad Ejaz, David Perez Saura, Asier Bikandi-Noya, Jose Luis Sanchez-Lopez, Holger Voos · 2026

Doorways and passages are critical structural elements for indoor robot navigation, yet they remain underexplored in modern Visual SLAM (VSLAM) frameworks. This paper presents a passage-aware structur…

VLM-VPI: A Vision-Language Reasoning Framework for Improving Automated Vehicle-Pedestrian Interactions

Qingwen Pu, Kun Xie, Yuxiang Liu · 2026

Autonomous driving systems often infer pedestrian yielding behavior from geometric and kinematic cues alone, limiting their ability to reason about visual scene context and age-dependent behavioral va…

Move-Then-Operate: Behavioral Phasing for Human-Like Robotic Manipulation

Haoming Xu, Lei Lei, Jie Gu, Chu Tang, Jingmin Chen, Ruiqi Wang · 2026

We present Move-Then-Operate, a Vision-Language-Action framework that explicitly decouples robotic manipulation into two distinct behavioral phases: coarse relocation (move) and contact-critical inter…

Adaptive Spatial-Temporal Graph Learning-Enabled Short-Term Voltage Stability Assessment against Time-Varying Topological Conditions

Chao Deng, Lipeng Zhu, Chang Liu, Hefeng Zhai, Baoye Tian, Zexiang Zhu, Jiayong Li, Cong Zhang · 2026

The emerging deep learning (DL) technology has recently exhibited great potential in data-driven short-term voltage stability (SVS) assessment of complex power grids. However, without sufficient atten…

DM-ASR: Diarization-aware Multi-speaker ASR with Large Language Models

Li Li, Ming Cheng, Weixin Zhu, Yannan Wang, Juan Liu, Ming Li · 2026

Multi-speaker automatic speech recognition (ASR) aims to transcribe conversational speech involving multiple speakers, requiring the model to capture not only what was said, but also who said it and s…

Beyond Acoustic Sparsity and Linguistic Bias: A Prompt-Free Paradigm for Mispronunciation Detection and Diagnosis

Haopeng Geng, Longfei Yang, Xi Chen, Haitong Sun, Daisuke Saito, Nobuaki Minematsu · 2026

Mispronunciation Detection and Diagnosis (MDD) requires modeling fine-grained acoustic deviations. However, current ASR-derived MDD systems often face inherent limitations. In particular, CTC-based mo…

Robust Localization for Autonomous Vehicles in Highway Scenes

Daqian Cheng, Xuchu Ding, Yujia Wu, Xiang Zhang, Lei Wang · 2026

Localization for autonomous vehicles on highways remains under-explored compared to urban roads, and state-of-the-art methods for urban scenes degrade when directly applied to highways. We identify ke…

Analytical PI Tuning for Second-Order Plants with Monotonic Response and Minimum Settling Time

Senol Gulgonul · 2026

Background: Tuning proportional-integral (PI) controllers for second-order plants to achieve monotonic step response with minimum settling time is an important problem in analytical control design. Ex…

CorridorVLA: Explicit Spatial Constraints for Generative Action Heads via Sparse Anchors

Dachong Li, ZhuangZhuang Chen, Jin Zhang, Jianqiang Li · 2026

Vision-Language-Action (VLA) models often use intermediate representations to connect multimodal inputs with continuous control, yet spatial guidance is often injected implicitly through latent feat…
