Recognition in Engineering — Research Repository

Engineering · Preprint

XR-CareerAssist: An Immersive Platform for Personalised Career Guidance Leveraging Extended Reality and Multimodal AI

N.D. Tantaroudas, A.J. McCracken, I. Karachalios, E. Papatheou, V. Pastrikakis · 2026

Conventional career guidance platforms rely on static, text-driven interfaces that struggle to engage users or deliver personalised, evidence-based insights. Although Computer-Assisted Career Guidance…

Engineering · Preprint

Deep Hierarchical Knowledge Loss for Fault Intensity Diagnosis

Yu Sha, Shuiping Gou, Bo Liu, Haofan Lu, Ningtao Liu, Jiahui Fu, Horst Stoecker, Domagoj Vnucec, Nadine Wetzstein, Andreas Widl, Kai Zhou · 2026

Fault intensity diagnosis (FID) plays a pivotal role in intelligent manufacturing, yet neglecting dependencies among target classes hinders its practical deployment. This paper introduces a novel and…

Engineering · Preprint

G-AMC: A Green Automatic Modulation Classification Method

Chee-An Yu, Young-Kai Chen, C.-C. Jay Kuo · 2026

In this work, we propose an efficient and transparent green learning pipeline to address the automatic modulation classification (AMC) problem. This pipeline aims to enable receivers to blindly identi…

Engineering · Preprint

Adaptive Material Fingerprinting for the fast discovery of polyconvex feature combinations in isotropic and anisotropic hyperelasticity

Moritz Flaschel, Hagen Holthusen, Denisa Martonova, Ellen Kuhl · 2026

We recently proposed a method called Material Fingerprinting for the rapid discovery of mechanical material models that avoids solving continuous optimization problems. Material Fingerprinting assumes…

Engineering · Preprint

INTERACT: An AI-Driven Extended Reality Framework for Accessible Communication Featuring Real-Time Sign Language Interpretation and Emotion Recognition

Nikolaos D. Tantaroudas, Andrew J. McCracken, Ilias Karachalios, Evangelos Papatheou · 2026

Video conferencing has become central to professional collaboration, yet most platforms offer limited support for deaf, hard-of-hearing, and multilingual users. The World Health Organisation estimates…

Engineering · Preprint

AI-Driven Modular Services for Accessible Multilingual Education in Immersive Extended Reality Settings: Integrating Speech Processing, Translation, and Sign Language Rendering

N.D. Tantaroudas, A.J. McCracken, I. Karachalios, E. Papatheou · 2026

This work introduces a modular platform that brings together six AI services: automatic speech recognition via OpenAI Whisper, multilingual translation through Meta NLLB, speech synthesis using AWS Po…

Engineering · Preprint

Activity Recognition Using mm-Wave Radar and Deep Learning: Prayer Tracker Case Study

Karim Saifullin, Sajid Ahmed, Mohamed-Slim Alouini · 2026

The issue of privacy has gained significant attention in recent times. Many real-world applications increasingly require the use of sensitive data, such as in surveillance or tracking and assistance s…

Engineering · Preprint

Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR

Zhennan Lin, Shuai Wang, Zhaokai Sun, Pengyuan Xie, Chuan Xie, Jie Liu, Qiang Zhang, Lei Xie · 2026

Transcribing and understanding multi-speaker conversations requires speech recognition, speaker attribution, and timestamp localization. While speech LLMs excel at single-speaker tasks, multi-speaker …

Engineering · Preprint

Foundation Models Defining A New Era In Sensor-based Human Activity Recognition: A Survey And Outlook

Sizhen Bian, Mengxi Liu, Lala Shakti Swarup Ray, Bo Zhou, Bin Guo, Zhiwen Yu, Thomas Ploetz, Paul Lukowicz, Siyu Yuan, Vitor Fortes Rey · 2026

Sensor-based Human Activity Recognition (HAR) underpins many ubiquitous and wearable computing applications, yet current models remain limited by scarce labels, sensor heterogeneity, and weak generali…

Engineering · Preprint

A ROS 2 Wrapper for Florence-2: Multi-Mode Local Vision-Language Inference for Robotic Systems

J. E. Dominguez-Vidal · 2026

Foundation vision-language models are becoming increasingly relevant to robotics because they can provide richer semantic perception than narrow task-specific pipelines. However, their practical adopt…

Engineering · Preprint

VisG AV-HuBERT: Viseme-Guided AV-HuBERT

Aristeidis Papadopoulos, Rishabh Jain, Naomi Harte · 2026

Audio-Visual Speech Recognition (AVSR) systems nowadays integrate Large Language Model (LLM) decoders with transformer-based encoders, achieving state-of-the-art results. However, the relative contrib…

Engineering · Preprint

HARNESS: Lightweight Distilled Arabic Speech Foundation Models

Vrunda N. Sukhadia, Shammur Absar Chowdhury · 2026

Large self-supervised speech (SSL) models achieve strong downstream performance, but their size limits deployment in resource-constrained settings. We present HArnESS, an Arabic-centric self-supervise…

Engineering · Preprint

Advancing LLM-based phoneme-to-grapheme for multilingual speech recognition

Lukuang Dong, Ziwei Li, Saierdaer Yusuyin, Xianyu Zhao, Zhijian Ou · 2026

Phoneme-based ASR factorizes recognition into speech-to-phoneme (S2P) and phoneme-to-grapheme (P2G), enabling cross-lingual acoustic sharing while keeping language-specific orthography in a separate m…

Engineering · Preprint

Generalizable Dense Reward for Long-Horizon Robotic Tasks

Silong Yong, Stephen Sheng, Carl Qi, Xiaojie Wang, Evan Sheehan, Anurag Shivaprasad, Yaqi Xie, Katia Sycara, Yesh Dattatreya · 2026

Existing robotic foundation policies are trained primarily via large-scale imitation learning. While such models demonstrate strong capabilities, they often struggle with long-horizon tasks due to dis…

Engineering · Preprint

Semantic Sensing: A Task-Oriented Paradigm

Xiaoqi Zhang, J. Andrew Zhang, Chang Liu, Weijie Yuan, Geoffrey Ye Li, Moeness G. Amin · 2026

Sensing and communication are fundamental enablers of next-generation networks. While communication technologies have advanced significantly, sensing remains limited to conventional parameter estimati…

Engineering · Preprint

EBuddy: a workflow orchestrator for industrial human-machine collaboration

Michele Banfi, Rocco Felici, Stefano Baraldo, Oliver Avram, Anna Valente · 2026

This paper presents EBuddy, a voice-guided workflow orchestrator for natural human-machine collaboration in industrial environments. EBuddy targets a recurrent bottleneck in tool-intensive workflows: …

Engineering · Preprint

Driving Condition-Aware Multi-Agent Integrated Power and Thermal Management for Hybrid Electric Vehicles

Hanghang Cui, Arash Khalatbarisoltani, Jie Han, Wenxue Liu, Muhammad Saeed, Xiaosong Hu · 2026

Effective co-optimization of energy management strategy (EMS) and thermal management (TM) is crucial for optimizing fuel efficiency in hybrid electric vehicles (HEVs). Driving conditions significantly…

Engineering · Preprint

Generalizable task-oriented object grasping through LLM-guided ontology and similarity-based planning

Hao Chen, Takuya Kiyokawa, Weiwei Wan, Kensuke Harada · 2026

Task-oriented grasping (TOG) is more challenging than simple object grasping because it requires precise identification of object parts and careful selection of grasping areas to ensure effective and …

Engineering · Preprint

Dual-branch Graph Domain Adaptation for Cross-scenario Multi-modal Emotion Recognition

Yuntao Shou, Jun Zhou, Tao Meng, Wei Ai, Keqin Li · 2026

Multimodal Emotion Recognition in Conversations (MERC) aims to predict speakers' emotional states in multi-turn dialogues through text, audio, and visual cues. In real-world settings, conversation sce…

Engineering · Preprint

Can Vision Foundation Models Navigate? Zero-Shot Real-World Evaluation and Lessons Learned

Maeva Guerrier, Karthik Soma, Jana Pavlasek, Giovanni Beltrame · 2026

Visual Navigation Models (VNMs) promise generalizable robot navigation by learning from large-scale visual demonstrations. Despite growing real-world deployment, existing evaluations rely almost excl…
