Audris Mockus in Engineering — Research Repository

Engineering Preprint PDF DOI

OmniRobotHome: A Multi-Camera Platform for Real-Time Multiadic Human-Robot Interaction

Junyoung Lee, Sookwan Han, Jeonghwan Kim, Inhee Lee, Mingi Choi, Jisoo Kim, Wonjung Woo, Hanbyul Joo · 2026

Human-robot collaboration has been studied primarily in dyadic or sequential settings. However, real homes require multiadic collaboration, where multiple humans and robots share a workspace, acting c…

Engineering Preprint PDF DOI

Sequential Inference for Gaussian Processes: A Signal Processing Perspective

Daniel Waxman, Fernando Llorente, Petar M. Djuric · 2026

The proliferation of capable and efficient machine learning (ML) models marks one of the strongest methodological shifts in signal processing (SP) in its nearly 100-year history. ML models support the…

Engineering Preprint PDF DOI

Design and Characteristics of a Thin-Film ThermoMesh for the Efficient Embedded Sensing of a Spatio-Temporally Sparse Heat Source

Sajjad Boorghan Farahan, Ahmed Alajlouni, Jingzhou Zhao · 2026

This work presents ThermoMesh, a passive thin-film thermoelectric mesh sensor designed to detect and characterize spatio-temporally sparse heat sources through conduction-based thermal imaging. The de…

Engineering Preprint PDF DOI

Experimental Performance of a 5G N78 Reconfigurable Intelligent Surface: From Controlled Measurements to Commercial Network Deployment

Sefa Kayraklık, Samed Kesir, Batuhan Kaplan, Ahmet Muaz Aktas, Emre Arslan, Ahmet Faruk Coskun · 2026

This paper presents a real-world experimental analysis of a modular reconfigurable intelligent surface (RIS) prototype designed to operate in the 5G N78 band. Unlike most RIS studies in the literature…

Engineering Preprint PDF DOI

LRS-VoxMM: A benchmark for in-the-wild audio-visual speech recognition

Doyeop Kwak, Jeongsoo Choi, Suyeon Lee, Joon Son Chung · 2026

We introduce LRS-VoxMM, an in-the-wild benchmark for audio-visual speech recognition (AVSR). The benchmark is derived from VoxMM, a dataset of diverse real-world spoken conversations with human-annota…

Engineering Preprint PDF DOI

On the Nesterov's acceleration: A NAIM perspective

Rachit Mehra, M Parimi, Amol Yerudkar, S.R. Wagh, Navdeep Singh · 2026

We present a unifying Nearly Asymptotically Invariant Manifold (NAIM) framework for understanding Nesterov's Accelerated Gradient (NAG) method. By lifting the first-order gradient flow into a second-or…

Engineering Preprint PDF DOI

SASI: Leveraging Sub-Action Semantics for Robust Early Action Recognition in Human-Robot Interaction

Yongpeng Cao, Masahiro Hirano, Hyuno Kim, Yuji Yamakawa · 2026

Understanding human actions is critical for advancing behavior analysis in human-robot interaction. Particularly in tasks that demand quick and proactive feedback, robots must recognize human actions …

Engineering Preprint PDF DOI

A Knowledge-Driven Approach to Target Speech Extraction in the Presence of Background Sound Effects for Cinematic Audio Source Separation (CASS)

Chun-wei Ho, Sabato Marco Siniscalchi, Kai Li, Chin-Hui Lee · 2026

We propose a knowledge-driven approach to speech target extraction in the presence of background sound effects already recorded in cinematic audio. The specific knowledge sources studied are manners o…

Engineering Preprint PDF DOI

PALCAS: A Priority-Aware Intelligent Lane Change Advisory System for Autonomous Vehicles using Federated Reinforcement Learning

Yassine Ibork, Nhat Ha Nguyen, Myounggyu Won, Lokesh Das · 2026

We present a priority-aware intelligent lane change advisory system based on multi-agent federated reinforcement learning, namely PALCAS, for autonomous vehicles (AVs). While existing lane-change appr…

Engineering Preprint PDF DOI

Similarity Choice and Negative Scaling in Supervised Contrastive Learning for Deepfake Audio Detection

Jaskirat Sudan, Hashim Ali, Surya Subramani, Hafiz Malik · 2026

Supervised contrastive learning (SupCon) is widely used to shape representations, but has seen limited targeted study for audio deepfake detection. Existing work typically combines contrastive terms w…

Engineering Preprint PDF DOI

Step-Audio-R1.5 Technical Report

Yuxin Zhang, Xiangyu Tony Zhang, Daijiao Liu, Fei Tian, Yayue Deng, Jun Chen, Qingjian Lin, Haoyang Zhang, Yuxin Li, Jinglan Gong, Yechang Huang, Liang Zhao, Chengyuan Yao, Hexin Liu, Eng Siong Chng, Xuerui Yang, Gang Yu, Xiangyu Zhang, Daxin Jiang · 2026

Recent advancements in large audio language models have extended Chain-of-Thought (CoT) reasoning into the auditory domain, enabling models to tackle increasingly complex acoustic and spoken tasks. To…

Engineering Preprint PDF DOI

Walking Through Uncertainty: An Empirical Study of Uncertainty Estimation for Audio-Aware Large Language Models

Chun-Yi Kuan, Wei-Ping Huang, Hung-yi Lee · 2026

Recent audio-aware large language models (ALLMs) have demonstrated strong capabilities across diverse audio understanding and reasoning tasks, but they still frequently produce hallucinated or overly …

Engineering Preprint PDF DOI

A Novel Two-Step Approach for Reactive Power Demand Calculation Using Integrated Voltage Stability Analysis

Hassan Abouelgheit, Hendrik Lens · 2026

The assessment of reactive power demand plays an instrumental role in power system planning. This paper presents a methodology for calculating reactive power demand based on a two-step approach. Unlik…

Engineering Preprint PDF DOI

Energy Efficiency Maximization for Discrete Activation based NOMA-assisted Pinching-Antenna Systems

Yishi Zhang, Aditya Powari, Kaidi Wang, Yaru Fu, Daniel K. C. So · 2026

Pinching-antenna systems (PASS) have recently attracted significant attention as a promising architecture for flexible and reconfigurable wireless communications. Despite notable advancements, researc…

Engineering Preprint PDF DOI

SPLIT: Separating Physical-Contact via Latent Arithmetic in Image-Based Tactile Sensors

Wadhah Zai El Amri, Nicolas Navarro-Guerrero · 2026

Training machine learning models for robotic tactile sensing requires vast amounts of data, yet obtaining realistic interaction data remains a challenge due to physical complexity and variability. Sim…

Engineering Preprint PDF DOI

AI-Native Autonomous Infrastructure (ANAI): A Formal Framework for the Next General-Purpose Technology

Hidir Selcuk Nogay · 2026

Artificial intelligence is increasingly described as a candidate next generation general purpose technology (GPT). However, existing interpretations predominantly emphasize performance scaling rather …

Engineering Preprint PDF DOI

Modular Sensory Stream for Integrating Physical Feedback in Vision-Language-Action Models

Jimin Lee, Huiwon Jang, Myungkyu Koo, Jungwoo Park, Jinwoo Shin · 2026

Humans understand and interact with the real world by relying on diverse physical feedback beyond visual perception. Motivated by this, recent approaches attempt to incorporate physical sensory signal…

Engineering Preprint PDF DOI

LeHome: A Simulation Environment for Deformable Object Manipulation in Household Scenarios

Zeyi Li, Yushi Yang, Shawn Xie, Kyle Xu, Tianxing Chen, Yuran Wang, Zhenhao Shen, Yan Shen, Yue Chen, Wenjun Li, Yukun Zheng, Chaorui Zhang, Siyi Lin, Fei Teng, Hongjun Yang, Ming Chen, Steve Xie, Ruihai Wu · 2026

Household environments present one of the most common, impactful yet challenging application domains for robotics. Within household scenarios, manipulating deformable objects is particularly difficult…

Engineering Preprint PDF DOI

Audio Effect Estimation with DNN-Based Prediction and Search Algorithm

Youichi Okita, Haruhiro Katayose · 2026

Audio effects play an essential role in sound design. This research addresses the task of audio effect estimation, which aims to estimate the configuration of applied effects from a wet signal. Existi…

Engineering Preprint PDF DOI

Listening with Time: Precise Temporal Awareness for Long-Form Audio Understanding

Mingchen Shao, Hang Su, Wenjie Tian, Bingshen Mu, Zhennan Lin, Lichun Fan, Zhenbo Luo, Jian Luan, Lei Xie · 2026

While Large Audio Language Models (LALMs) achieve strong performance on short audio, they degrade on long-form inputs. This degradation is more severe in temporal awareness tasks, where temporal align…
