Expertini Research

Browse Research Papers

14,737+ open-access research outputs.

Showing 14737 results for "visual perception" in Engineering
Engineering Preprint PDF DOI

The False Resonance: A Critical Examination of Emotion Embedding Similarity for Speech Generation Evaluation

Yun-Shao Tsai, Yi-Cheng Lin, Huang-Cheng Chou, Tzu-Wen Hsu, Yun-Man Hsu, Chun Wei Chen, Shrikanth Narayanan, Hung-yi Lee · 2026

Objective metrics for emotional expressiveness are vital for speech generation, particularly in expressive synthesis and voice conversion requiring emotional prosody transfer. To quantify this, the fi…

Engineering Preprint PDF DOI

Optimizing Tracking Accuracy in Energy-Constrained Multimodal ISAC via Lyapunov-Driven Heterogeneous Mixture-of-Experts

Wenqi Fan, Ning Wei, Ahmad Bazzi, Rongyan Xi, Zhixian Song, You Li, Zhihan Zeng, Yue Xiu, Chadi Assi · 2026

The integration of multimodal sensing and millimeter-wave (mmWave) communications is a key enabler for highly mobile vehicle-to-infrastructure (V2I) networks. However, continuous high-resolution visua…

Engineering Preprint PDF DOI

Robot Planning and Situation Handling with Active Perception

Austine Oloo, Zainab Altaweel, Yohei Hayamizu, Peiqi Liu, Yan Ding, Saeid Amiri, Hao Yang, Andy Kaminski, Chad Esselink, Chris Paxton, Xiaohan Zhang, Shiqi Zhang · 2026

Current robots are capable of computing plans to accomplish complex tasks. However, real-world environments are inherently open and dynamic, and unforeseen situations frequently arise during plan exec…

Engineering Preprint PDF DOI

Privileged Foresight Distillation: Zero-Cost Future Correction for World Action Models

Pengcheng Fang, Hongli Chen, Xiaohao Cai · 2026

World action models jointly predict future video and action during training, raising an open question about what role the future-prediction branch actually plays. A recent finding shows that this bran…

Engineering Preprint PDF DOI

KinDER: A Physical Reasoning Benchmark for Robot Learning and Planning

Yixuan Huang, Bowen Li, Vaibhav Saxena, Yichao Liang, Utkarsh Aashu Mishra, Liang Ji, Lihan Zha, Jimmy Wu, Nishanth Kumar, Sebastian Scherer, Danfei Xu, Tom Silver · 2026

Robotic systems that interact with the physical world must reason about kinematic and dynamic constraints imposed by their own embodiment, their environment, and the task at hand. We introduce KinDER,…

Engineering Preprint PDF DOI

GS-Playground: A High-Throughput Photorealistic Simulator for Vision-Informed Robot Learning

Yufei Jia, Heng Zhang, Ziheng Zhang, Junzhe Wu, Mingrui Yu, Zifan Wang, Dixuan Jiang, Zheng Li, Chenyu Cao, Zhuoyuan Yu, Xun Yang, Haizhou Ge, Yuchi Zhang, Jiayuan Zhang, Zhenbiao Huang, Tianle Liu, Shenyu Chen, Jiacheng Wang, Bin Xie, Xuran Yao, Xiwa Deng, Guangyu Wang, Jinzhi Zhang, Lei Hao, Zhixing Chen, Yuxiang Chen, Anqi Wang, Hongyun Tian, Yiyi Yan, Zhanxiang Cao, Yizhou Jiang, Hanyang Shao, Yue Li, Lu Shi, Bokui Chen, Wei Sui, Hanqing Cui, Yusen Qin, Ruqi Huang, Lei Han, Tiancai Wang, Guyue Zhou · 2026

Embodied AI research is undergoing a shift toward vision-centric perceptual paradigms. While massively parallel simulators have catalyzed breakthroughs in proprioception-based locomotion, their potent…

Engineering Preprint PDF DOI

ANCHOR: A Physically Grounded Closed-Loop Framework for Robust Home-Service Mobile Manipulation

Jinhao Jiang, Shengyu Fang, Sibo Zuo, Yujie Tang, Yirui Li · 2026

Recent advances in open-vocabulary mobile manipulation have brought robots into real domestic environments. In such settings, reliable long-horizon execution under open-set object references and frequ…

Engineering Preprint PDF DOI

Cross-Linguistic Rhythmic and Spectral Feature-Based Analysis of Nyishi and Adi: Two Under-Resourced Languages of Arunachal Pradesh

Deepshikha Gogoi, Parismita Gogoi, Yang Saring · 2026

Under-resourced languages remain underrepresented in quantitative rhythm research, particularly in systematic intra-branch analysis of acoustic differentiation within closely related linguistic groups.…

Engineering Preprint PDF DOI

TEACar: An Open-Source Autonomous Driving Platform

Zhongzheng Zhang, Maxwell Ruyle, Andrew Kappes, Tyler Ruble, William Shaoul, Dana Moreno, Jack Penn, Ivan Ruchkin · 2026

Intelligent Transportation Systems (ITS) increasingly rely on vision-based perception and learning-based control, necessitating experimental platforms that support realistic hardware-in-the-loop valid…

Engineering Preprint PDF DOI

Libra-VLA: Achieving Learning Equilibrium via Asynchronous Coarse-to-Fine Dual-System

Yifei Wei, Linqing Zhong, Yi Liu, Yuxiang Lu, Xindong He, Maoqing Yao, Guanghui Ren · 2026

Vision-Language-Action (VLA) models are a promising paradigm for generalist robotic manipulation by grounding high-level semantic instructions into executable physical actions. However, prevailing app…

Engineering Preprint PDF DOI

An analysis of sensor selection for fruit picking with suction-based grippers

Eva Krueger, Marcus Rosette, Joseph R. Davidson · 2026

Robotic fruit harvesting often fails to reliably detect whether a fruit has been successfully picked, limiting efficiency and increasing crop damage. This problem is difficult due to compliant fruit a…

Engineering Preprint PDF DOI

VISION-SLS: Safe Perception-Based Control from Learned Visual Representations via System Level Synthesis

Antoine P. Leeman, Shuyu Zhan, Melanie N. Zeilinger, Glen Chou · 2026

We propose VISION-SLS, a method for nonlinear output-feedback control from high-resolution RGB images which provides robust constraint satisfaction guarantees under calibrated uncertainty bounds despi…

Engineering Preprint PDF DOI

Passage-Aware Structural Mapping for RGB-D Visual SLAM

Ali Tourani, Miguel Fernandez-Cortizas, Saad Ejaz, David Perez Saura, Asier Bikandi-Noya, Jose Luis Sanchez-Lopez, Holger Voos · 2026

Doorways and passages are critical structural elements for indoor robot navigation, yet they remain underexplored in modern Visual SLAM (VSLAM) frameworks. This paper presents a passage-aware structur…

Engineering Preprint PDF DOI

Agent-Centric Visual Reinforcement Learning under Dynamic Perturbations

Zhengru Fang, Yu Guo, Fei Liu, Yuang Zhang, Yihang Tao, Senkang Hu, Wenbo Ding, Yuguang Fang · 2026

Visual reinforcement learning aims to empower an agent to learn policies from visual observations, yet it remains vulnerable to dynamic visual perturbations, such as unpredictable shifts in corruption…

Engineering Preprint PDF DOI

FreqCache: Accelerating Embodied VLN Models with Adaptive Frequency-Guided Token Caching

Zihao Zheng, Xingyue Zhou, Zhihao Mao, Songyu Sun, Lingyue Zhang, Yulong Ao, Yupu Feng, Qiongqiong Zhang, Yonghua Lin, Xiang Chen · 2026

Vision-Language-Navigation (VLN) models exhibit excellent navigation accuracy but incur high computational overhead. Token caching has emerged as a promising training-free strategy to reduce this cost…

Engineering Preprint PDF DOI

Deep Learning-Enabled Dissolved Oxygen Sensing in Biofouling Environments for Ocean Monitoring

Nikolaos Salaris, Adrien Desjardins, Manish K. Tiwari · 2026

The escalating climate crisis and ecosystem degradation demand intelligent, low-cost sensors capable of robust, long-term monitoring in real-world environments. Absolute dissolved oxygen (DO) concentr…

Engineering Preprint PDF DOI

AsyncShield: A Plug-and-Play Edge Adapter for Asynchronous Cloud-based VLA Navigation

Kai Yang, Zedong Chu, Yingnan Guo, Zhengbo Wang, Shichao Xie, Yanfen Shen, Xiaolong Wu, Xing Li, Mu Xu · 2026

While Vision-Language-Action (VLA) models have been demonstrated possessing strong zero-shot generalization for robot control, their massive parameter sizes typically necessitate cloud-based deploymen…

Engineering Preprint PDF DOI

Event-based SLAM Benchmark for High-Speed Maneuvers

Sheng Zhong, Junkai Niu, Guillermo Gallego, Kaizhen Sun, Yang Yi, Zhiqiang Miao, Dewen Hu, Yaonan Wang, Davide Scaramuzza, Yi Zhou · 2026

Event-based cameras are bio-inspired sensors with pixels that independently and asynchronously respond to brightness changes at microsecond resolution, offering the potential to handle visual tasks in…

Engineering Preprint PDF DOI

VLM-VPI: A Vision-Language Reasoning Framework for Improving Automated Vehicle-Pedestrian Interactions

Qingwen Pu, Kun Xie, Yuxiang Liu · 2026

Autonomous driving systems often infer pedestrian yielding behavior from geometric and kinematic cues alone, limiting their ability to reason about visual scene context and age-dependent behavioral va…

Engineering Preprint PDF DOI

Decentralized Heterogeneous Multi-Robot Collaborative Exploration for Indoor and Outdoor 3D Environments

Yuxiang Li, Kun Chen, Jiancheng Wang, Shihao Fang, Haoyao Chen, Yunhui Liu · 2026

Heterogeneous multi-robot systems feature significant adaptability for complex environments. However, effective collaboration that fully exploits the robots' potential remains a core challenge. This p…
