Programming Languages in Engineering — Research Repository

Engineering Preprint PDF DOI

Skill-Evolving Grounded Reasoning for Free-Text Promptable 3D Medical Image Segmentation

Tongrui Zhang, Chenhui Wang, Yongming Li, Zhihao Chen, Xufeng Zhan, Hongming Shan · 2026

Free-text promptable 3D medical image segmentation offers an intuitive and clinically flexible interaction paradigm. However, current methods are highly sensitive to linguistic variability: minor chan…

UniGround: Universal 3D Visual Grounding via Training-Free Scene Parsing

Jiaxi Zhang, Yunheng Wang, Wei Lu, Taowen Wang, Weisheng Xu, Shuning Zhang, Yixiao Feng, Yuetong Fang, Renjing Xu · 2026

Understanding and localizing objects in complex 3D environments from natural language descriptions, known as 3D Visual Grounding (3DVG), is a foundational challenge in embodied AI, with broad implicat…

SaiVLA-0: Cerebrum–Pons–Cerebellum Tripartite Architecture for Compute-Aware Vision-Language-Action

Xiang Shi, Wenlong Huang, Menglin Zou, Xinhai Sun · 2026

We revisit Vision-Language-Action through a neuroscience-inspired triad. Biologically, the Cerebrum provides stable high-level multimodal priors and remains frozen; the Pons Adapter integrates these c…

Towards Human-Like Manipulation through RL-Augmented Teleoperation and Mixture-of-Dexterous-Experts VLA

Tutian Tang, Xingyu Ji, Wanli Xing, Ce Hao, Wenqiang Xu, Lin Shao, Cewu Lu, Qiaojun Yu, Jiangmiao Pang, Kaifeng Zhang · 2026

While Vision-Language-Action (VLA) models have demonstrated remarkable success in robotic manipulation, their application has largely been confined to low-degree-of-freedom end-effectors performing si…

Language-Invariant Multilingual Speaker Verification for the TidyVoice 2026 Challenge

Ze Li, Xiaoxiao Miao, Juan Liu, Ming Li · 2026

Multilingual speaker verification (SV) remains challenging due to limited cross-lingual data and language-dependent information in speaker embeddings. This paper presents a language-invariant multilin…

See and Switch: Vision-Based Branching for Interactive Robot-Skill Programming

Petr Vanc, Jan Kristof Behrens, Vaclav Hlavac, Karla Stepanova · 2026

Programming robots by demonstration (PbD) is an intuitive concept, but scaling it to real-world variability remains a challenge for most current teaching frameworks. Conditional task graphs are very e…

AffordGrasp: Cross-Modal Diffusion for Affordance-Aware Grasp Synthesis

Xiaofei Wu, Yi Zhang, Yumeng Liu, Yuexin Ma, Yujiao Shi, Xuming He · 2026

Generating human grasping poses that accurately reflect both object geometry and user-specified interaction semantics is essential for natural hand-object interactions in AR/VR and embodied AI. Howeve…

NaviDriveVLM: Decoupling High-Level Reasoning and Motion Planning for Autonomous Driving

Ximeng Tao, Pardis Taghavi, Dimitar Filev, Reza Langari, Gaurav Pandey · 2026

Vision-language models (VLMs) have emerged as a promising direction for end-to-end autonomous driving (AD) by jointly modeling visual observations, driving context, and language-based reasoning. Howev…

RoboRouter: Training-Free Policy Routing for Robotic Manipulation

Yiteng Chen, Zhe Cao, Hongjia Ren, Chenjie Yang, Wenbo Li, Shiyi Wang, Yemin Wang, Li Zhang, Yanming Shao, Zhenjun Zhao, Huiping Zhuang, Qingyao Wu · 2026

Research on robotic manipulation has developed a diverse set of policy paradigms, including vision-language-action (VLA) models, vision-action (VA) policies, and code-based compositional approaches. C…

Viewpoint-Agnostic Grasp Pipeline using VLM and Partial Observations

Dilermando Almeida, Juliano Negri, Guilherme Lazzarini, Thiago H. Segreto, Ranulfo Bezerra, Ricardo V. Godoy, Marcelo Becker · 2026

Robust grasping in cluttered, unstructured environments remains challenging for mobile legged manipulators due to occlusions that lead to partial observations, unreliable depth estimates, and the need…

Relating Reinforcement Learning to Dynamic Programming-Based Planning

Filip V. Georgiev, Kalle G. Timperi, Basak Sakcak, Steven M. LaValle · 2026

This paper bridges some of the gap between optimal planning and reinforcement learning (RL), both of which share roots in dynamic programming applied to sequential decision making or optimal control. …

Reasoning Knowledge-Gap in Drone Planning via LLM-based Active Elicitation

Zeyu Fang, Beomyeol Yu, Cheng Liu, Zeyuan Yang, Rongqian Chen, Yuxin Lin, Mahdi Imani, Tian Lan · 2026

Human-AI joint planning in Unmanned Aerial Vehicles (UAVs) typically relies on control handover when facing environmental uncertainties, which is often inefficient and cognitively demanding for non-ex…

Uncertainty Mitigation and Intent Inference: A Dual-Mode Human-Machine Joint Planning System

Zeyu Fang, Yuxin Lin, Cheng Liu, Beomyeol Yu, Zeyuan Yang, Rongqian Chen, Taeyoung Lee, Mahdi Imani, Tian Lan · 2026

Effective human-robot collaboration in open-world environments requires joint planning under uncertain conditions. However, existing approaches often treat humans as passive supervisors, preventing au…

Temperature-Aware Scheduling of LLM Inference in Large-Scale Geo-Distributed Edge Data Centers with Distributed Optimization

Arash Khalatbarisoltani, Amin Mahmoudi, Jie Han, Muhammad Saeed, Wenxue Liu, Jinwen Li, Solmaz Kahourzade, Amirmehdi Yazdani, Xiaosong Hu · 2026

The environmental impact of Large Language Models (LLMs) on data centers hosting these models is becoming a significant concern. While many efforts have focused on reducing the substantial training ov…

AeroPlace-Flow: Language-Grounded Object Placement for Aerial Manipulators via Visual Foresight and Object Flow

Sarthak Mishra, Rishabh Dev Yadav, Naveen Nair, Wei Pan, Spandan Roy · 2026

Precise object placement remains underexplored in aerial manipulation, where most systems rely on predefined target coordinates and focus primarily on grasping and control. Specifying exact placement …

Secure and Robust Beamforming Design for STAR-RIS-aided MU-MIMO ISAC Systems

Rakesh Ranjan, Anshu Mukherjee, Manjesh K. Hanawal, Keshav Singh, Ioannis Krikidis · 2026

Simultaneous transmitting and reflecting reconfigurable intelligent surfaces (STAR-RIS) offer a transformative approach for integrated sensing and communication (ISAC) systems, particularly for enhanc…

AtomicVLA: Unlocking the Potential of Atomic Skill Learning in Robots

Likui Zhang, Tao Tang, Zhihao Zhan, Xiuwei Chen, Zisheng Chen, Jianhua Han, Jiangtong Zhu, Pei Xu, Hang Xu, Hefeng Wu, Liang Lin, Xiaodan Liang · 2026

Recent advances in Visual-Language-Action (VLA) models have shown promising potential for robotic manipulation tasks. However, real-world robotic tasks often involve long-horizon, multi-step problem-s…

TempoFit: Plug-and-Play Layer-Wise Temporal KV Memory for Long-Horizon Vision-Language-Action Manipulation

Jun Sun, Boyu Yang, Jiahao Zhang, Ning Ma, Chencheng Wu, Siqing Zhang, Yiou Huang, Qiufeng Wang, Shan Liang, Yaran Chen · 2026

Pretrained Vision-Language-Action (VLA) policies have achieved strong single-step manipulation, but their inference remains largely memoryless, which is brittle in non-Markovian long-horizon settings …

ACCURATE: Arbitrary-shaped Continuum Reconstruction Under Robust Adaptive Two-view Estimation

Yaozhi Zhang, Shun Yu, Yugang Zhang, Yang Liu · 2026

Accurate reconstruction of arbitrary-shaped long slender continuum bodies, such as guidewires, catheters and other soft continuum manipulators, is essential for accurate mechanical simulation. However…

HSC-VLA: Hierarchical Scene-Clearing for Robust Bimanual Manipulation in Dense Clutter

Zhen Liu, Xinyu Ning, Zhe Hu, XinXin Xie, Yitong Liu, Zhongzhu Pu · 2026

Modern Vision–Language–Action models often suffer from critical instruction-following failures in high-density manipulation environments, where task-irrelevant visual clutter dilutes attention, corr…
