Expertini Research

Browse Research Papers

14,737+ open-access research outputs.

Showing 14737 results for "visual perception" in Engineering
Engineering Preprint PDF DOI

Mask World Model: Predicting What Matters for Robust Robot Policy Learning

Yunfan Lou, Xiaowei Chi, Xiaojie Zhang, Zezhong Qian, Chengxuan Li, Rongyu Zhang, Yaoxu Lyu, Guoyu Song, Chuyao Fu, Haoxuan Xu, Pengwei Wang, Shanghang Zhang · 2026

World models derived from large-scale video generative pre-training have emerged as a promising paradigm for generalist robot policy learning. However, standard approaches often focus on high-fidelity…

Engineering Preprint PDF DOI

A Gesture-Based Visual Learning Model for Acoustophoretic Interactions using a Swarm of AcoustoBots

Alex Lin, Lei Gao, Narsimlu Kemsaram, Sriram Subramanian · 2026

AcoustoBots are mobile acoustophoretic robots capable of delivering mid-air haptics, directional audio, and acoustic levitation, but existing implementations rely on scripted commands and lack an intu…

Engineering Preprint PDF DOI

Autonomous UAV Pipeline Near-proximity Inspection via Disturbance-Aware Predictive Visual Servoing

Wen Li, Hui Wang, Jinya Su, Cunjia Liu, Wen-Hua Chen, Shihua Li · 2026

Reliable pipeline inspection is critical to safe energy transportation, but is constrained by long distances, complex terrain, and risks to human inspectors. Unmanned aerial vehicles provide a flexibl…

Engineering Preprint PDF DOI

GenerativeMPC: VLM-RAG-guided Whole-Body MPC with Virtual Impedance for Bimanual Mobile Manipulation

Marcelino Julio Fernando, Miguel Altamirano Cabrera, Jeffrin Sam, Yara Mahmoud, Konstantin Gubernatorov, Dzmitry Tsetserukou · 2026

Bimanual mobile manipulation requires a seamless integration between high-level semantic reasoning and safe, compliant physical interaction - a challenge that end-to-end models approach opaquely and c…

Engineering Preprint PDF DOI

Achieving Interaction Fluidity in a Wizard-of-Oz Robotic System: A Prototype for Fluid Error-Correction

Carlos Baptista De Lima, Julian Hough, Frank Forster, Patrick Holthaus, Yongjun Zheng · 2026

Achieving truly fluid interaction with robots with speech interfaces remains a hard problem, and the experience of current Human-Robot Interaction (HRI) remains laboured and frustrating. Some of the b…

Engineering Preprint PDF DOI

Quadruped Parkour Learning: Sparsely Gated Mixture of Experts with Visual Input

Michael Ziegltrum, Jianhao Jiao, Tianhu Peng, Chengxu Zhou, Dimitrios Kanoulas · 2026

Robotic parkour provides a compelling benchmark for advancing locomotion over highly challenging terrain, including large discontinuities such as elevated steps. Recent approaches have demonstrated im…

Engineering Preprint PDF DOI

Warmth and Competence in the Swarm: Designing Effective Human-Robot Teams

Genki Miyauchi, Roderich Groß, Chaona Chen · 2026

As groups of robots increasingly collaborate with humans, understanding how humans perceive them is critical for designing effective human-robot teams. While prior research examined how humans interpr…

Engineering Preprint PDF DOI

RoboWM-Bench: A Benchmark for Evaluating World Models in Robotic Manipulation

Feng Jiang, Yang Chen, Kyle Xu, Yuchen Liu, Haifeng Wang, Zhenhao Shen, Jasper Lu, Shengze Huang, Yuanfei Wang, Chen Xie, Ruihai Wu · 2026

Recent advances in large-scale video world models have enabled increasingly realistic future prediction, raising the prospect of leveraging imagined videos for robot learning. However, visual realism …

Engineering Preprint PDF DOI

AeroBridge-TTA: Test-Time Adaptive Language-Conditioned Control for UAVs

Lingxue Lyu · 2026

Language-guided unmanned aerial vehicles (UAVs) often fail not from bad reasoning or perception, but from execution mismatch: the gap between a planned trajectory and the controller's ability to tra…

Engineering Preprint PDF DOI

RoomRecon: High-Quality Textured Room Layout Reconstruction on Mobile Devices

Seok Joon Kim, Dinh Duc Cao, Federica Spinola, Se Jin Lee, Kyu Sung Cho · 2026

Widespread RGB-Depth (RGB-D) sensors and advanced 3D reconstruction technologies facilitate the capture of indoor spaces, improving the fields of augmented reality (AR), virtual reality (VR), and exte…

Engineering Preprint PDF DOI

Inertia Matching Principle: Improving Transient Synchronization Stability in Hybrid Power Systems With VSGs and SGs

Changjun He, Li Zhang, Qi Liu, Rui Zou · 2026

This paper investigates the transient synchronization stability in power systems hybridized with virtual synchronous generators (VSGs) and synchronous generators (SGs). A relative swing equation model…

Engineering Preprint PDF DOI

AI-Enabled Image-Based Hybrid Vision/Force Control of Tendon-Driven Aerial Continuum Manipulators

Shayan Sepahvand, Farrokh Janabi-Sharifi, Farhad Aghili · 2026

This paper presents an AI-enabled cascaded hybrid vision/force control framework for tendon-driven aerial continuum manipulators based on constant-strain modeling in $SE(3)$ as a coupled system. The p…

Engineering Preprint PDF DOI

Hybrid SMI Realization via Matrix Completion and Riemannian Manifold Optimization on Narrowband Sub-Array Based Architectures

Tarun Suman Cousik, Rohit Rangaraj, Nishith Tripathi, Jeffrey H Reed, Daniel Jakubisin, Jon Kraft · 2026

Hybrid beamforming architectures reduce hardware complexity but restrict access to full array observations, rendering direct implementation of classical covariance based methods such as minimum varian…

Engineering Preprint PDF DOI

A Controlled Benchmark of Visual State-Space Backbones with Domain-Shift and Boundary Analysis for Remote-Sensing Segmentation

Nichula Wasalathilaka, Dineth Perera, Oshadha Samarakoon, Buddhi Wijenayake, Roshan Godaliyadda, Vijitha Herath, Parakrama Ekanayake · 2026

Visual state-space models (SSMs) are increasingly promoted as efficient alternatives to Vision Transformers, yet their practical advantages remain unclear under fair comparison because existing studie…

Engineering Preprint PDF DOI

EmbodiedLGR: Integrating Lightweight Graph Representation and Retrieval for Semantic-Spatial Memory in Robotic Agents

Paolo Riva, Leonardo Gargani, Matteo Frosi, Matteo Matteucci · 2026

As the world of agentic artificial intelligence applied to robotics evolves, the need for agents capable of building and retrieving memories and observations efficiently is increasing. Robots operatin…

Engineering Preprint PDF DOI

Leader-Follower Formation Control Using Differential Drag and Effective Surface Regulation

Alessio Bocci, Jose Juan Corona-Sanchez, Raymond Kristiansen · 2026

The growing interest in space activities has led to the emergence of new space operators and innovative mission concepts. Small satellites such as CubeSats reduce mission costs and are typically deplo…

Engineering Preprint PDF DOI

SpaceDex: Generalizable Dexterous Grasping in Tiered Workspaces

Wensheng Wang, Chuanjun Guo, Wei Wei, Tong Wu, Ning Tan · 2026

Generalizable grasping with high-degree-of-freedom (DoF) dexterous hands remains challenging in tiered workspaces, where occlusion, narrow clearances, and height-dependent constraints are substantiall…

Engineering Preprint PDF DOI

StableIDM: Stabilizing Inverse Dynamics Model against Manipulator Truncation via Spatio-Temporal Refinement

Kerui Li, Zhe Jing, Xiaofeng Wang, Zheng Zhu, Yukun Zhou, Guan Huang, Dongze Li, Qingkai Yang, Huaibo Huang · 2026

Inverse Dynamics Models (IDMs) map visual observations to low-level action commands, serving as central components for data labeling and policy execution in embodied AI. However, their performance deg…

Engineering Preprint PDF DOI

ST-$\pi$: Structured SpatioTemporal VLA for Robotic Manipulation

Chuanhao Ma, Hanyu Zhou, Shihan Peng, Yan Li, Tao Gu, Luxin Yan · 2026

Vision-language-action (VLA) models have achieved great success on general robotic tasks, but still face challenges in fine-grained spatiotemporal manipulation. Typically, existing methods mainly embe…

Engineering Preprint PDF DOI

SYMBOLIZER: Symbolic Model-free Task Planning with VLMs

Sami Azirar, Zlatan Ajanovic, Hermann Blum · 2026

Traditional Task and Motion Planning (TAMP) systems depend on physics models for motion planning and discrete symbolic models for task planning. Although physics models are often available, symbolic mo…
