Recognition — Research Repository

AI & Data Science · Preprint

Qualitative Evaluation of Language Model Rescoring in Automatic Speech Recognition

Thibault Baneras-Roux, Mickael Rouvier, Jane Wottawa, Richard Dufour · 2026

Evaluating automatic speech recognition (ASR) systems is a classical but difficult and still open problem, which often boils down to focusing only on the word error rate (WER). However, this metric su…

Engineering · Preprint

SASI: Leveraging Sub-Action Semantics for Robust Early Action Recognition in Human-Robot Interaction

Yongpeng Cao, Masahiro Hirano, Hyuno Kim, Yuji Yamakawa · 2026

Understanding human actions is critical for advancing behavior analysis in human-robot interaction. Particularly in tasks that demand quick and proactive feedback, robots must recognize human actions …

Computer Science · Preprint

Examining discontinuance of AI-mediated informal digital learning of English (AI-IDLE) among university students: Evidence from SEM and fsQCA

Yiran Du, Huimin He · 2026

This study examined university students' discontinuance intention towards AI-mediated informal digital learning of English (AI-IDLE). Drawing on the cognition-affect-conation framework, the study inve…

Computer Science · Preprint

Why Learners Drift In and Out: Examining Intermittent Discontinuance in AI-Mediated Informal Digital English Learning (AI-IDLE) Using SEM and fsQCA

Yiran Du, Huimin He · 2026

This study examined intermittent discontinuance in AI-mediated informal digital learning of English (AI-IDLE) through the cognition-affect-conation framework. Survey data were collected from 632 Chine…

Engineering · Preprint

BUT System Description for CHiME-9 MCoRec Challenge

Dominik Klement, Alexander Polok, Nguyen Hai Phong, Prachi Singh, Lukas Burget · 2026

Multi-talker automatic speech recognition (ASR) in conversational recordings remains an open problem, particularly in scenarios with a large portion of overlapping speech where identifying and transcrib…

AI & Data Science · Preprint

InteractWeb-Bench: Can Multimodal Agent Escape Blind Execution in Interactive Website Generation?

Qiyao Wang, Haoran Hu, Longze Chen, Hongbo Wang, Hamid Alinejad-Rokny, Yuan Lin, Min Yang · 2026

With the advancement of multimodal large language models (MLLMs) and coding agents, website development has shifted from manual programming to agent-based project-level code synthesis. Existing be…

AI & Data Science · Preprint

Detecting is Easy, Adapting is Hard: Local Expert Growth for Visual Model-Based Reinforcement Learning under Distribution Shift

Haiyang Zhao · 2026

Visual model-based reinforcement learning (MBRL) agents can perform well on the training distribution, but often break down once the test environment shifts. In visual MBRL, recognizing that a shift h…

AI & Data Science · Preprint

VeraRetouch: A Lightweight Fully Differentiable Framework for Multi-Task Reasoning Photo Retouching

Yihong Guo, Youwei Lyu, Jiajun Tang, Yizhuo Zhou, Hongliang Wang, Jinwei Chen, Changqing Zou, Qingnan Fan · 2026

Reasoning photo retouching has gained significant traction, requiring models to analyze image defects, give reasoning processes, and execute precise retouching enhancements. However, existing approach…

AI & Data Science · Preprint

Measurement Risk in Supervised Financial NLP: Rubric and Metric Sensitivity on JF-ICR

Sidi Chang, Peiying Zhu, Yuxiao Chen, Rongdong Chai · 2026

As LLMs become credible readers of earnings calls, investor-relations Q&A, guidance, and disclosure language, supervised financial NLP benchmarks increasingly function as decision evidence for model …

AI & Data Science · Preprint

CasLayout: Cascaded 3D Layout Diffusion for Indoor Scene Synthesis with Implicit Relation Modeling

Yingrui Wu, Youkang Kong, Mingyang Zhao, Weize Quan, Dong-Ming Yan, Yang Liu · 2026

Synthesizing realistic 3D indoor scenes remains challenging due to data scarcity and the difficulty of simultaneously enforcing global architectural constraints and local semantic consistency. Existin…

AI & Data Science · Preprint

CoAX: Cognitive-Oriented Attribution eXplanation User Model of Human Understanding of AI Explanations

Louth Bin Rawshan, Zhuoyu Wang, Brian Y. Lim · 2026

Explainable AI (XAI) aims to improve user understanding and decisions when using AI models. However, despite innovations in XAI, recent user evaluations reveal that this goal remains elusive. Understa…

AI & Data Science · Preprint

Gait Recognition via Deep Residual Networks and Multi-Branch Feature Fusion

Yabo Luo, Xiaoyun Wang, Cunrong Li · 2026

Gait recognition has emerged as a compelling biometric modality for surveillance and security applications, offering inherent advantages such as non-intrusiveness, resistance to disguise, and long-ran…

Physics · Preprint

Multiresonant Membrane Metasurfaces for Multifunctional Fingerprint Recognition and Real-time Biochemical Tracking

Quanlong Yang, Yapeng Dou, Dongyang Wang, Yihua Zhong, Fei Li, Jiajun He, Ying Zhang, Quan Xu, Junliang Yang, Ilya Shadrivov, Jiaguang Han, Yuri Kivshar · 2026

Label-free identification and real-time tracking of biochemical substances have become critical for molecular diagnostics and chemical analysis, yet conventional resonant terahertz metasurface sensing reli…

AI & Data Science · Preprint

Student Classroom Behavior Recognition Based on Improved YOLOv8s

Xiang Gao, Shuai Hang · 2026

In classroom teaching, student behavior can reflect their learning state and classroom participation, which is of great significance for teaching quality analysis. To address the problems of dense stu…

Computer Science · Preprint

Few-Shot Accent Synthesis for ASR with LLM-Guided Phoneme Editing

Yurii Halychanskyi, Nimet Beyza Bozdag, Mark Hasegawa-Johnson, Dilek Hakkani-Tur, Volodymyr Kindratenko · 2026

Accented automatic speech recognition (ASR) often degrades due to the limited availability of accented training data. Prior work has explored accent modeling in low-resource settings, but existing app…

AI & Data Science · Preprint

Reinforced Agent: Inference-Time Feedback for Tool-Calling Agents

Anh Ta, Junjie Zhu, Shahin Shayandeh · 2026

Tool-calling agents are evaluated on tool selection, parameter accuracy, and scope recognition, yet LLM trajectory assessments remain inherently post-hoc. Disconnected from the active execution loop, …

AI & Data Science · Preprint

Targeted Linguistic Analysis of Sign Language Models with Minimal Translation Pairs

Serpil Karabuklu, Kanishka Misra, Shester Gueuwou, Diane Brentari, Greg Shakhnarovich, Karen Livescu · 2026

Models of sign language have historically lagged behind those for spoken language (text and speech). Recent work has greatly improved their performance on tasks like sign language translation and isol…

AI & Data Science · Preprint

AttriBE: Quantifying Attribute Expressivity in Body Embeddings for Recognition and Identification

Basudha Pal, Siyuan Huang, Anirudh Nanduri, Zhaoyang Wang, Rama Chellappa · 2026

Person re-identification (ReID) systems that match individuals across images or video frames are essential in many real-world applications. However, existing methods are often influenced by attributes…

AI & Data Science · Preprint

Selective Augmentation: Improving Universal Automatic Phonetic Transcription via G2P Bootstrapping

Tobias Bystrich, Julia M. Pritzen, Christoph A. Schmidt, Claudia Wich-Reif · 2026

In the field of universal automatic phonetic transcription (APT), clean and diverse training transcriptions are required. However, such high-quality data is limited. We propose the bootstrapping appro…

AI & Data Science · Preprint

Energy-Efficient Plant Monitoring via Knowledge Distillation

Ilyass Moummad, Reda Bensaid, Kawtar Zaher, Herve Goeau, Jean-Christophe Lombardo, Joseph Salmon, Pierre Bonnet, Alexis Joly · 2026

Recent advances in large-scale visual representation learning have significantly improved performance in plant species and plant disease recognition tasks. However, state-of-the-art models, often base…
