Visual Perception in Engineering — Research Repository

Engineering Preprint PDF DOI

Hybrid Architecture Gets Fluid: A New Paradigm for Direction-of-arrival Estimation in 6G Networks

Ye Tian, Jiaji Ren, Tuo Wu, Wei Liu, Maged Elkashlan, Matthew C. Valenti, Naofal Al-Dhahir, Hing Cheung So · 2026

High-precision direction-of-arrival (DOA) estimation, as a key sensing capability for 6G-enabled applications such as autonomous driving and extended reality, is increasingly dependent on the effectiv…

RadarSplat-RIO: Indoor Radar-Inertial Odometry with Gaussian Splatting-Based Radar Bundle Adjustment

Pou-Chun Kung, Yuan Tian, Zhengqin Li, Yue Liu, Eric Whitmire, Wolf Kienzle, Hrvoje Benko · 2026

Radar is more resilient to adverse weather and lighting conditions than visual and Lidar simultaneous localization and mapping (SLAM). However, most radar SLAM pipelines still rely heavily on frame-to…

RobotPan: A 360° Surround-View Robotic Vision System for Embodied Perception

Jiahao Ma, Qiang Zhang, Peiran Liu, Zeran Su, Pihai Sun, Gang Han, Wen Zhao, Wei Cui, Zhang Zhang, Zhiyuan Xu, Renjing Xu, Jian Tang, Miaomiao Liu, Yijie Guo · 2026

Surround-view perception is increasingly important for robotic navigation and loco-manipulation, especially in human-in-the-loop settings such as teleoperation, data collection, and emergency takeover…

Utilizing Inpainting for Keypoint Detection for Vision-Based Control of Robotic Manipulators

Sreejani Chatterjee, Venkatesh Mullur, Abhinav Gandhi, Berk Calli · 2026

In this paper we present a novel visual servoing framework to control a robotic manipulator in the configuration space by using purely natural visual features. Our goal is to develop methods that can …

GeoVision-Enabled Digital Twin for Hybrid Autonomous-Teleoperated Medical Responses

Parham Kebria, Soheil Sabri, Laura J Brattain · 2026

Remote medical response systems are increasingly being deployed to support emergency care in disaster-affected and infrastructure-limited environments. Enabled by GeoVision capabilities, this paper pr…

XRZero-G0: Pushing the Frontier of Dexterous Robotic Manipulation with Interfaces, Quality and Ratios

James Wang, Primo Pu, Zephyr Fung, Alex Wang, Sam Wang, Bender Deng, Kevin Wang, Zivid Liu, Chris Pan, Panda Yang, Andy Zhai, Lucy Liang, Shalfun Li, Johnny Sun, Jacky Xu, Will Tian, Kai Yan, Kohler Ye, Scott Li, Qian Wang, Roy Gan, Hao Wang · 2026

The acquisition of high-quality, action-aligned demonstration data remains a fundamental bottleneck in scaling foundation models for dexterous robot manipulation. Although robot-free human demonstrati…

Learning Low-Dimensional Representation for O-RAN Testing via Transformer-ESN

Jiongyu Dai, Raymond Zhao, Farhad Rezazadeh, Lizhong Zheng, Haining Wang, Lingjia Liu · 2026

Open Radio Access Network (O-RAN) architectures enhance flexibility for 6G and NextG networks. However, they also bring significant challenges in O-RAN testing with evaluating abundant, high-dimensiona…

RMGS-SLAM: Real-time Multi-sensor Gaussian Splatting SLAM

Dongen Li, Yi Liu, Junqi Liu, Zewen Sun, Zefan Huang, Shuo Sun, Jiahui Liu, Chengran Yuan, Hongliang Guo, Francis E.H. Tay, Marcelo H. Ang Jr · 2026

Achieving real-time Simultaneous Localization and Mapping (SLAM) based on 3D Gaussian splatting (3DGS) in large-scale real-world environments remains challenging, as existing methods still struggle to…

DINO-Explorer: Active Underwater Discovery via Ego-Motion Compensated Semantic Predictive Coding

Yuhan Jin, Nayari Marie Lessa, Mariela De Lucas Alvarez, Melvin Laux, Lucas Amparo Barbosa, Frank Kirchner, Rebecca Adam · 2026

Marine ecosystem degradation necessitates continuous, scientifically selective underwater monitoring. However, most autonomous underwater vehicles (AUVs) operate as passive data loggers, capturing exh…

E2E-Fly: An Integrated Training-to-Deployment System for End-to-End Quadrotor Autonomy

Fangyu Sun, Fanxing Li, Linzuo Zhang, Yu Hu, Renbiao Jin, Shuyu Wu, Wenxian Yu, Danping Zou · 2026

Training and transferring learning-based policies for quadrotors from simulation to reality remains challenging due to inefficient visual rendering, physical modeling inaccuracies, unmodeled sensor di…

Robotic Manipulation is Vision-to-Geometry Mapping ($f(v) \rightarrow G$): Vision-Geometry Backbones over Language and Video Models

Zijian Song, Qichang Li, Jiawei Zhou, Zhenlong Yuan, Tianshui Chen, Liang Lin, Guangrun Wang · 2026

At its core, robotic manipulation is a problem of vision-to-geometry mapping ($f(v) \rightarrow G$). Physical actions are fundamentally defined by geometric properties like 3D positions and spatial re…

GGD-SLAM: Monocular 3DGS SLAM Powered by Generalizable Motion Model for Dynamic Environments

Yi Liu, Haoxuan Xu, Hongbo Duan, Keyu Fan, Zhengyang Zhang, Peiyu Zhuang, Pengting Luo, Houde Liu · 2026

Visual SLAM algorithms achieve significant improvements through the exploration of 3D Gaussian Splatting (3DGS) representations, particularly in generating high-fidelity dense maps. However, they depe…

VULCAN: Vision-Language-Model Enhanced Multi-Agent Cooperative Navigation for Indoor Fire-Disaster Response

Shengding Liu, Qiben Yan · 2026

Indoor fire disasters pose severe challenges to autonomous search and rescue due to dense smoke, high temperatures, and dynamically evolving indoor environments. In such time-critical scenarios, multi…

Actuation space reduction to facilitate insightful shape matching in a novel reconfigurable tendon driven continuum manipulator

Sabyasachi Dash, John Golden, Girish Krishnan · 2026

In tendon driven continuum manipulators (TDCMs), reconfiguring the tendon routing enables tailored spatial deformation of the backbone. This work presents a design in which tendons can be rerouted eit…

Habitat-GS: A High-Fidelity Navigation Simulator with Dynamic Gaussian Splatting

Ziyuan Xia, Jingyi Xu, Chong Cui, Yuanhong Yu, Jiazhao Zhang, Qingsong Yan, Tao Ni, Junbo Chen, Xiaowei Zhou, Hujun Bao, Ruizhen Hu, Sida Peng · 2026

Training embodied AI agents depends critically on the visual fidelity of simulation environments and the ability to model dynamic humans. Current simulators rely on mesh-based rasterization with limit…

Situation-Aware Feedback-Predictive Control Framework for Lane-Less Dense Traffic

Parthib Khound, Debraj Chakraborty · 2026

Navigating dense, lane-less traffic remains one of the most challenging scenarios for autonomous vehicles, especially in emerging regions where road structure and driver behavior are highly unpredicta…

Social Learning Strategies for Evolved Virtual Soft Robots

K. Ege de Bruin, Kyrre Glette, Kai Olav Ellefsen, Giorgia Nadizar, Eric Medvet · 2026

Optimizing the body and brain of a robot is a coupled challenge: the morphology determines what control strategies are effective, while the control parameters influence how well the morphology perform…

HazardArena: Evaluating Semantic Safety in Vision-Language-Action Models

Zixing Chen, Yifeng Gao, Li Wang, Yunhan Zhao, Yi Liu, Jiayu Li, Xiang Zheng, Zuxuan Wu, Cong Wang, Xingjun Ma, Yu-Gang Jiang · 2026

Vision-Language-Action (VLA) models inherit rich world knowledge from vision-language backbones and acquire executable skills via action demonstrations. However, existing evaluations largely focus on …

Room compensation for loudspeaker reproduction using a supporting source

James Brooks-Park, Søren Bech, Jan Østergaard, Steven van de Par · 2026

Room compensation aims to improve the accuracy of loudspeaker reproduction in reverberant environments. Traditional methods, however, are limited to improving only spectral (timbral) and temporal accu…

An Ultra-Low Latency, End-to-End Streaming Speech Synthesis Architecture via Block-Wise Generation and Depth-Wise Codec Decoding

Tianhui Su, Tien-Ping Tan, Salima Mdhaffar, Yannick Esteve, Aghilas Sini · 2026

Real-time speech synthesis requires balancing inference latency and acoustic fidelity for interactive applications. Conventional continuous text-to-speech pipelines require computationally intensive n…
