Expertini Research

Browse Research Papers

14,737+ open-access research outputs.

Showing 14,737 results for "visual perception" in Engineering

RACF: A Resilient Autonomous Car Framework with Object Distance Correction

Chieh Tsai, Hossein Rastgoftar, Salim Hariri · 2026

Autonomous vehicles are increasingly deployed in safety-critical applications, where sensing failures or cyberphysical attacks can lead to unsafe operations resulting in human loss and/or severe physi…

Unveiling the Surprising Efficacy of Navigation Understanding in End-to-End Autonomous Driving

Zhihua Hua, Junli Wang, Pengfei LI, Qihao Jin, Bo Zhang, Kehua Sheng, Yilun Chen, Zhongxue Gan, Wenchao Ding · 2026

Global navigation information and local scene understanding are two crucial components of autonomous driving systems. However, our experimental results indicate that many end-to-end autonomous driving…

Robotic Nanoparticle Synthesis via Solution-based Processes

Dasharadhan Mahalingam, Michael Gallagher, Nilanjan Chakraborty, Stanislaus S. Wong · 2026

We present a screw geometry-based manipulation planning framework for the robotic automation of solution-based synthesis, exemplified through the preparation of gold and magnetite nanoparticles. The s…

Why Your Tokenizer Fails in Information Fusion: A Timing-Aware Pre-Quantization Fusion for Video-Enhanced Audio Tokenization

Xiangyu Zhang, Benjamin John Southwell, Siqi Pan, Xinlei Niu, Beena Ahmed, Julien Epps · 2026

Audio tokenization has emerged as a critical component in end-to-end audio language models, enabling efficient discrete representation learning for both audio understanding and generation tasks. Howev…

Parametric Interpolation of Dynamic Mode Decomposition for Predicting Nonlinear Systems

Ananda Chakrabarti, Haitham H. Saleh, Indranil Nayak, Balasubramaniam Shanker, Fernando L. Teixeira, Debdipta Goswami · 2026

We present parameter-interpolated dynamic mode decomposition (piDMD), a parametric reduced-order modeling framework that embeds known parameter-affine structure directly into the DMD regression step. …

3DRO: Lidar-level SE(3) Direct Radar Odometry Using a 2D Imaging Radar and a Gyroscope

Cedric Le Gentil, Daniil Lisus, Timothy D. Barfoot · 2026

Recently, the robotics community has regained interest in radar-based perception and state estimation. A 2D imaging radar provides dense 360° information about the environment. Despite the radar ant…

ReefMapGS: Enabling Large-Scale Underwater Reconstruction by Closing the Loop Between Multimodal SLAM and Gaussian Splatting

Daniel Yang, Jungseok Hong, John J. Leonard, Yogesh Girdhar · 2026

3D Gaussian Splatting is a powerful visual representation, providing high-quality and efficient 3D scene reconstruction, but it is crucially dependent on accurate camera poses typically obtained from …

M2HRI: An LLM-Driven Multimodal Multi-Agent Framework for Personalized Human-Robot Interaction

Shaid Hasan, Breenice Lee, Sujan Sarker, Tariq Iqbal · 2026

Multi-robot systems hold significant promise for social environments such as homes and hospitals, yet existing multi-robot works treat robots as functionally identical, overlooking how robots individu…

Grounded World Model for Semantically Generalizable Planning

Quanyi Li, Lan Feng, Haonan Zhang, Wuyang Li, Letian Wang, Alexandre Alahi, Harold Soh · 2026

In Model Predictive Control (MPC), world models predict the future outcomes of various action proposals, which are then scored to guide the selection of the optimal action. For visuomotor MPC, the sco…

Dual-Control Frequency-Aware Diffusion Model for Depth-Dependent Optical Microrobot Microscopy Image Generation

Lan Wei, Zongcai Tan, Kangyi Lu, Jian-Qing Zheng, Dandan Zhang · 2026

Optical microrobots actuated by optical tweezers (OT) are important for cell manipulation and microscale assembly, but their autonomous operation depends on accurate 3D perception. Developing such per…

VLMaterial: Vision-Language Model-Based Camera-Radar Fusion for Physics-Grounded Material Identification

Jiangyou Zhu, He Chen · 2026

Accurate material recognition is a fundamental capability for intelligent perception systems to interact safely and effectively with the physical world. For instance, distinguishing visually similar o…

Micro-Dexterity in Biological Micromanipulation: Embodiment, Perception, and Control

Kangyi Lu, Lan Wei, Zongcai Tan, Dandan Zhang · 2026

Microscale manipulation has advanced substantially in controlled locomotion and targeted transport, yet many biomedical applications require precise and adaptive interaction with biological micro-obje…

Wideband Sensing with Dynamic Metasurface Antennas under Realistic Phase Response Modeling

Ioannis Gavras, George C. Alexandropoulos · 2026

This paper investigates the impact of practical features of the emerging antenna array technology of Dynamic Metasurface Antennas (DMAs) when used for wideband sensing. By adopting a realistic DMA res…

EagleVision: A Multi-Task Benchmark for Cross-Domain Perception in High-Speed Autonomous Racing

Zakhar Yagudin, Murad Mebrahtu, Ren Jin, Jiaqi Huang, Yujia Yue, Dzmitry Tsetserukou, Jorge Dias, Majid Khonji · 2026

High-speed autonomous racing presents extreme perception challenges, including large relative velocities and substantial domain shifts from conventional urban-driving datasets. Existing benchmarks do …

Minimal Embodiment Enables Efficient Learning of Number Concepts in Robot

Zhegong Shangguan, Alessandro Di Nuovo, Angelo Cangelosi · 2026

Robots are increasingly entering human-interactive scenarios that require understanding of quantity. How intelligent systems acquire abstract numerical concepts from sensorimotor experience remains a …

CLASP: Closed-loop Asynchronous Spatial Perception for Open-vocabulary Desktop Object Grasping

Yiran Ling, Wenxuan Li, Siying Dong, Yize Zhang, Xiaoyao Huang, Jing Jiang, Ruonan Li, Jie Liu · 2026

Robot grasping of desktop object is widely used in intelligent manufacturing, logistics, and agriculture. Although vision-language models (VLMs) show strong potential for robotic manipulation, their de…

Learning to Forget -- Hierarchical Episodic Memory for Lifelong Robot Deployment

Leonard Barmann, Joana Plewnia, Alex Waibel, Tamim Asfour · 2026

Robots must verbalize their past experiences when users ask "Where did you put my keys?" or "Why did the task fail?" Yet maintaining life-long episodic memory (EM) from continuous multimodal perceptio…

Toward Environment-Aware LAE: SAR as a Shared Sensing Infrastructure

Xue Zhang, Bang Huang, Mohamed-Slim Alouini · 2026

The rapid growth of the low-altitude economy (LAE) is making aerial systems an important part of future digital infrastructure. Although major advances have been achieved in unmanned aerial vehicle (U…

ViserDex: Visual Sim-to-Real for Robust Dexterous In-hand Reorientation

Arjun Bhardwaj, Maximum Wilder-Smith, Mayank Mittal, Vaishakh Patil, Marco Hutter · 2026

In-hand object reorientation requires precise estimation of the object pose to handle complex task dynamics. While RGB sensing offers rich semantic cues for pose tracking, existing solutions rely on m…

AIM: Intent-Aware Unified world action Modeling with Spatial Value Maps

Liaoyuan Fan, Zetian Xu, Chen Cao, Wenyao Zhang, Mingqi Yuan, Jiayu Chen · 2026

Pretrained video generation models provide strong priors for robot control, but existing unified world action models still struggle to decode reliable actions without substantial robot-specific traini…
