Programming Languages in Engineering — Research Repository

Engineering Preprint PDF DOI

Building Explicit World Model for Zero-Shot Open-World Object Manipulation

Xiaotong Li, Gang Chen, Javier Alonso-Mora · 2026

Open-world object manipulation remains a fundamental challenge in robotics. While Vision-Language-Action (VLA) models have demonstrated promising results, they rely heavily on large-scale robot action…


ST-VLA: Enabling 4D-Aware Spatiotemporal Understanding for General Robot Manipulation

You Wu, Zixuan Chen, Cunxu Ou, Wenxuan Wang, Wenbo Huang, Lin Cao, Yangtao Chen, Weichao Qiu, Xingyue Quan, Jieqi Shi, Jing Huo, Yang Gao · 2026

Robotic manipulation in open-world environments requires reasoning across semantics, geometry, and long-horizon action dynamics. Existing hierarchical Vision-Language-Action (VLA) frameworks typically…


Your Vision-Language-Action Model Already Has Attention Heads For Path Deviation Detection

Jaehwan Jeong, Evelyn Zhu, Jinying Lin, Emmanuel Jaimes, Tuan-Anh Vu, Jungseock Joo, Sangpil Kim, M. Khalid Jawed · 2026

Vision-Language-Action (VLA) models have demonstrated strong potential for predicting semantic actions in navigation tasks, demonstrating the ability to reason over complex linguistic instructions and…


SAATT Nav: a Socially Aware Autonomous Transparent Transportation Navigation Framework for Wheelchairs

Yutong Zhang, Shaiv Y. Mehra, Bradley S. Duerstock, Juan P. Wachs · 2026

While powered wheelchairs reduce physical fatigue as opposed to manual wheelchairs for individuals with mobility impairment, they demand high cognitive workload due to information processing, decision…


Learning Actionable Manipulation Recovery via Counterfactual Failure Synthesis

Dayou Li, Jiuzhou Lei, Hao Wang, Lulin Liu, Yunhao Yang, Zihan Wang, Bangya Liu, Minghui Zheng, Zhiwen Fan · 2026

While recent foundation models have significantly advanced robotic manipulation, these systems still struggle to autonomously recover from execution errors. Current failure-learning paradigms rely on …


ADIOSS Automatic Diagnostic Of System Simulations

Di Jiang, Sebastian Rodriguez, Herve Colin, Yves Tourbier, Francisco Chinesta · 2026

Automotive engineering makes extensive use of numerical simulation throughout the design process. The development of numerical models, their validation against experimental tests, and their updating d…


DecoVLN: Decoupling Observation, Reasoning, and Correction for Vision-and-Language Navigation

Zihao Xin, Wentong Li, Yixuan Jiang, Bin Wang, Runmin Cong, Jie Qin, Shengjun Huang · 2026

Vision-and-Language Navigation (VLN) requires agents to follow long-horizon instructions and navigate complex 3D environments. However, existing approaches face two major challenges: constructing an e…


Evaluating VLMs' Spatial Reasoning Over Robot Motion: A Step Towards Robot Planning with Motion Preferences

Wenxi Wu, Jingjing Zhang, Martim Brandao · 2026

Understanding user instructions and object spatial relations in surrounding environments is crucial for intelligent robot systems to assist humans in various tasks. The natural language and spatial re…


SldprtNet: A Large-Scale Multimodal Dataset for CAD Generation in Language-Driven 3D Design

Ruogu Li, Sikai Li, Yao Mu, Mingyu Ding · 2026

We introduce SldprtNet, a large-scale dataset comprising over 242,000 industrial parts, designed for semantic-driven CAD modeling, geometric deep learning, and the training and fine-tuning of multimod…


Language-Grounded Decoupled Action Representation for Robotic Manipulation

Wuding Weng, Tongshu Wu, Liucheng Chen, Siyu Xie, Zheng Wang, Xing Xu, Jingkuan Song, Heng Tao Shen · 2026

The heterogeneity between high-level vision-language understanding and low-level action control remains a fundamental challenge in robotic manipulation. Although recent methods have advanced task-spec…


ReMem-VLA: Empowering Vision-Language-Action Model with Memory via Dual-Level Recurrent Queries

Hang Li, Fengyi Shen, Dong Chen, Liudi Yang, Xudong Wang, Jinkui Shi, Zhenshan Bing, Ziyuan Liu, Alois Knoll · 2026

Vision-language-action (VLA) models for closed-loop robot control are typically cast under the Markov assumption, making them prone to errors on tasks requiring historical context. To incorporate memo…


RoboStream: Weaving Spatio-Temporal Reasoning with Memory in Vision-Language Models for Robotics

Yuzhi Huang, Jie Wu, Weijue Bu, Ziyi Xiong, Gaoyang Jiang, Ye Li, Kangye Ji, Shuzhao Xie, Yue Huang, Chenglei Wu, Jingyan Jiang, Zhi Wang · 2026

Enabling reliable long-horizon robotic manipulation is a crucial step toward open-world embodied intelligence. However, VLM-based planners treat each step as an isolated observation-to-action mapping,…


MotionAnymesh: Physics-Grounded Articulation for Simulation-Ready Digital Twins

WenBo Xu, Liu Liu, Li Zhang, Dan Guo, RuoNan Liu · 2026

Converting static 3D meshes into interactable articulated assets is crucial for embodied AI and robotic simulation. However, existing zero-shot pipelines struggle with complex assets due to a critical…


AnchorVLA4D: an Anchor-Based Spatial-Temporal Vision-Language-Action Model for Robotic Manipulation

Juan Zhu, Zhanying Shao, Xiaoqi Li, Ethan Morgan, Jiadong Xu, Hongwei Fan, Hao Dong · 2026

Since current Vision-Language-Action (VLA) systems suffer from limited spatial perception and the absence of memory throughout manipulation, we investigate visual anchors as a means to enhance spatial…


Altered Thoughts, Altered Actions: Probing Chain-of-Thought Vulnerabilities in VLA Robotic Manipulation

Tuan Duong Trinh, Naveed Akhtar, Basim Azam · 2026

Recent Vision-Language-Action (VLA) models increasingly adopt chain-of-thought (CoT) reasoning, generating a natural-language plan before decoding motor commands. This internal text channel between th…


HaltNav: Reactive Visual Halting over Lightweight Topological Priors for Robust Vision-Language Navigation

Zihui Yu, Pingcong Li, Bichi Zhang, Soren Schwertfeger · 2026

Vision-and-Language Navigation (VLN) is shifting from rigid, step-by-step instruction following toward open-vocabulary, goal-oriented autonomy. Achieving this transition without exhaustive routing pro…


Spatially Grounded Long-Horizon Task Planning in the Wild

Sehun Jung, HyunJee Song, Dong-Hee Kim, Reuben Tan, Jianfeng Gao, Yong Jae Lee, Donghyun Kim · 2026

Recent advances in robot manipulation increasingly leverage Vision-Language Models (VLMs) for high-level reasoning, such as decomposing task instructions into sequential action plans expressed in natu…


TacVLA: Contact-Aware Tactile Fusion for Robust Vision-Language-Action Manipulation

Kaidi Zhang, Heng Zhang, Zhengtong Xu, Zhiyuan Zhang, Md Rakibul Islam Prince, Xiang Li, Xiaojing Han, Yuhao Zhou, Arash Ajoudani, Yu She · 2026

Vision-Language-Action (VLA) models have demonstrated significant advantages in robotic manipulation. However, their reliance on vision and language often leads to suboptimal performance in tasks invo…


From Woofs to Words: Towards Intelligent Robotic Guide Dogs with Verbal Communication

Yohei Hayamizu, David DeFazio, Hrudayangam Mehta, Zainab Altaweel, Jacqueline Choe, Chao Lin, Jake Juettner, Furui Xiao, Jeremy Blackburn, Shiqi Zhang · 2026

Assistive robotics is an important subarea of robotics that focuses on the well-being of people with disabilities. A robotic guide dog is an assistive quadruped robot that helps visually impaired peop…


Beyond Dense Futures: World Models as Structured Planners for Robotic Manipulation

Minghao Jin, Mozheng Liao, Mingfei Han, Zhihui Li, Xiaojun Chang · 2026

Recent world-model-based Vision-Language-Action (VLA) architectures have improved robotic manipulation through predictive visual foresight. However, dense future prediction introduces visual redundanc…
