Visual Perception in Engineering — Research Repository

Engineering Preprint PDF DOI

RACF: A Resilient Autonomous Car Framework with Object Distance Correction

Chieh Tsai, Hossein Rastgoftar, Salim Hariri · 2026

Autonomous vehicles are increasingly deployed in safety-critical applications, where sensing failures or cyberphysical attacks can lead to unsafe operations resulting in human loss and/or severe physi…

Unveiling the Surprising Efficacy of Navigation Understanding in End-to-End Autonomous Driving

Zhihua Hua, Junli Wang, Pengfei Li, Qihao Jin, Bo Zhang, Kehua Sheng, Yilun Chen, Zhongxue Gan, Wenchao Ding · 2026

Global navigation information and local scene understanding are two crucial components of autonomous driving systems. However, our experimental results indicate that many end-to-end autonomous driving…

Robotic Nanoparticle Synthesis via Solution-based Processes

Dasharadhan Mahalingam, Michael Gallagher, Nilanjan Chakraborty, Stanislaus S. Wong · 2026

We present a screw geometry-based manipulation planning framework for the robotic automation of solution-based synthesis, exemplified through the preparation of gold and magnetite nanoparticles. The s…

Why Your Tokenizer Fails in Information Fusion: A Timing-Aware Pre-Quantization Fusion for Video-Enhanced Audio Tokenization

Xiangyu Zhang, Benjamin John Southwell, Siqi Pan, Xinlei Niu, Beena Ahmed, Julien Epps · 2026

Audio tokenization has emerged as a critical component in end-to-end audio language models, enabling efficient discrete representation learning for both audio understanding and generation tasks. Howev…

Parametric Interpolation of Dynamic Mode Decomposition for Predicting Nonlinear Systems

Ananda Chakrabarti, Haitham H. Saleh, Indranil Nayak, Balasubramaniam Shanker, Fernando L. Teixeira, Debdipta Goswami · 2026

We present parameter-interpolated dynamic mode decomposition (piDMD), a parametric reduced-order modeling framework that embeds known parameter-affine structure directly into the DMD regression step. …

3DRO: Lidar-level SE(3) Direct Radar Odometry Using a 2D Imaging Radar and a Gyroscope

Cedric Le Gentil, Daniil Lisus, Timothy D. Barfoot · 2026

Recently, the robotics community has regained interest in radar-based perception and state estimation. A 2D imaging radar provides dense 360° information about the environment. Despite the radar ant…

ReefMapGS: Enabling Large-Scale Underwater Reconstruction by Closing the Loop Between Multimodal SLAM and Gaussian Splatting

Daniel Yang, Jungseok Hong, John J. Leonard, Yogesh Girdhar · 2026

3D Gaussian Splatting is a powerful visual representation, providing high-quality and efficient 3D scene reconstruction, but it is crucially dependent on accurate camera poses typically obtained from …

M2HRI: An LLM-Driven Multimodal Multi-Agent Framework for Personalized Human-Robot Interaction

Shaid Hasan, Breenice Lee, Sujan Sarker, Tariq Iqbal · 2026

Multi-robot systems hold significant promise for social environments such as homes and hospitals, yet existing multi-robot works treat robots as functionally identical, overlooking how robots individu…

Grounded World Model for Semantically Generalizable Planning

Quanyi Li, Lan Feng, Haonan Zhang, Wuyang Li, Letian Wang, Alexandre Alahi, Harold Soh · 2026

In Model Predictive Control (MPC), world models predict the future outcomes of various action proposals, which are then scored to guide the selection of the optimal action. For visuomotor MPC, the sco…

Dual-Control Frequency-Aware Diffusion Model for Depth-Dependent Optical Microrobot Microscopy Image Generation

Lan Wei, Zongcai Tan, Kangyi Lu, Jian-Qing Zheng, Dandan Zhang · 2026

Optical microrobots actuated by optical tweezers (OT) are important for cell manipulation and microscale assembly, but their autonomous operation depends on accurate 3D perception. Developing such per…

VLMaterial: Vision-Language Model-Based Camera-Radar Fusion for Physics-Grounded Material Identification

Jiangyou Zhu, He Chen · 2026

Accurate material recognition is a fundamental capability for intelligent perception systems to interact safely and effectively with the physical world. For instance, distinguishing visually similar o…

Micro-Dexterity in Biological Micromanipulation: Embodiment, Perception, and Control

Kangyi Lu, Lan Wei, Zongcai Tan, Dandan Zhang · 2026

Microscale manipulation has advanced substantially in controlled locomotion and targeted transport, yet many biomedical applications require precise and adaptive interaction with biological micro-obje…

Wideband Sensing with Dynamic Metasurface Antennas under Realistic Phase Response Modeling

Ioannis Gavras, George C. Alexandropoulos · 2026

This paper investigates the impact of practical features of the emerging antenna array technology of Dynamic Metasurface Antennas (DMAs) when used for wideband sensing. By adopting a realistic DMA res…

EagleVision: A Multi-Task Benchmark for Cross-Domain Perception in High-Speed Autonomous Racing

Zakhar Yagudin, Murad Mebrahtu, Ren Jin, Jiaqi Huang, Yujia Yue, Dzmitry Tsetserukou, Jorge Dias, Majid Khonji · 2026

High-speed autonomous racing presents extreme perception challenges, including large relative velocities and substantial domain shifts from conventional urban-driving datasets. Existing benchmarks do …

Minimal Embodiment Enables Efficient Learning of Number Concepts in Robot

Zhegong Shangguan, Alessandro Di Nuovo, Angelo Cangelosi · 2026

Robots are increasingly entering human-interactive scenarios that require understanding of quantity. How intelligent systems acquire abstract numerical concepts from sensorimotor experience remains a …

CLASP: Closed-loop Asynchronous Spatial Perception for Open-vocabulary Desktop Object Grasping

Yiran Ling, Wenxuan Li, Siying Dong, Yize Zhang, Xiaoyao Huang, Jing Jiang, Ruonan Li, Jie Liu · 2026

Robot grasping of desktop objects is widely used in intelligent manufacturing, logistics, and agriculture. Although vision-language models (VLMs) show strong potential for robotic manipulation, their de…

Learning to Forget -- Hierarchical Episodic Memory for Lifelong Robot Deployment

Leonard Barmann, Joana Plewnia, Alex Waibel, Tamim Asfour · 2026

Robots must verbalize their past experiences when users ask "Where did you put my keys?" or "Why did the task fail?" Yet maintaining life-long episodic memory (EM) from continuous multimodal perceptio…

Toward Environment-Aware LAE: SAR as a Shared Sensing Infrastructure

Xue Zhang, Bang Huang, Mohamed-Slim Alouini · 2026

The rapid growth of the low-altitude economy (LAE) is making aerial systems an important part of future digital infrastructure. Although major advances have been achieved in unmanned aerial vehicle (U…

ViserDex: Visual Sim-to-Real for Robust Dexterous In-hand Reorientation

Arjun Bhardwaj, Maximum Wilder-Smith, Mayank Mittal, Vaishakh Patil, Marco Hutter · 2026

In-hand object reorientation requires precise estimation of the object pose to handle complex task dynamics. While RGB sensing offers rich semantic cues for pose tracking, existing solutions rely on m…

AIM: Intent-Aware Unified world action Modeling with Spatial Value Maps

Liaoyuan Fan, Zetian Xu, Chen Cao, Wenyao Zhang, Mingqi Yuan, Jiayu Chen · 2026

Pretrained video generation models provide strong priors for robot control, but existing unified world action models still struggle to decode reliable actions without substantial robot-specific traini…
