Programming Languages in Engineering — Research Repository

Engineering Preprint PDF DOI

StemVLA: An Open-Source Vision-Language-Action Model with Future 3D Spatial Geometry Knowledge and 4D Historical Representation

Jiasong Xiao, Yutao She, Kai Li, Yuyang Sha, Ziang Cheng, Ziang Tong · 2026

Vision-language-action (VLA) models integrate visual observations and language instructions to predict robot actions, demonstrating promising generalization in manipulation tasks. However, most existi…

Engineering Preprint PDF DOI

SAGE-LLM: Towards Safe and Generalizable LLM Controller with Fuzzy-CBF Verification and Graph-Structured Knowledge Retrieval for UAV Decision

Wenzhe Zhao, Yang Zhao, Ganchao Liu, Zhiyu Jiang, Dandan Ma, Zihao Li, Xuelong Li · 2026

In UAV dynamic decision, complex and variable hazardous factors pose severe challenges to the generalization capability of algorithms. Despite offering semantic understanding and scene generalization,…

Engineering Preprint PDF DOI

FAVLA: A Force-Adaptive Fast-Slow VLA model for Contact-Rich Robotic Manipulation

Yao Li, Peiyuan Tang, Wuyang Zhang, Chengyang Zhu, Yifan Duan, Weikai Shi, Xiaodong Zhang, Zijiang Yang, Jianmin Ji, Yanyong Zhang · 2026

Force/torque feedback can substantially improve Vision-Language-Action (VLA) models on contact-rich manipulation, but most existing approaches fuse all modalities at a single operating frequency. This…

Engineering Preprint PDF DOI

KEEP: A KV-Cache-Centric Memory Management System for Efficient Embodied Planning

Zebin Yang, Tong Xie, Baotong Lu, Shaoshan Liu, Bo Yu, Meng Li · 2026

Memory-augmented Large Language Models (LLMs) have demonstrated remarkable capability for complex and long-horizon embodied planning. By keeping track of past experiences and environmental states, mem…

Engineering Preprint PDF DOI

VCA: Vision-Click-Action Framework for Precise Manipulation of Segmented Objects in Target Ambiguous Environments

Donggeon Kim, Seungwon Jan, Hyeonjun Park, Daegyu Lim · 2026

The reliance on language in Vision-Language-Action (VLA) models introduces ambiguity, cognitive overhead, and difficulties in precise object identification and sequential task execution, particularly …

Engineering Preprint PDF DOI

Embedding Morphology into Transformers for Cross-Robot Policy Learning

Kei Suzuki, Jing Liu, Ye Wang, Chiori Hori, Matthew Brand, Diego Romeres, Toshiaki Koike-Akino · 2026

Cross-robot policy learning -- training a single policy to perform well across multiple embodiments -- remains a central challenge in robot learning. Transformer-based policies, such as vision-languag…

Engineering Preprint PDF DOI

TaCarla: A comprehensive benchmarking dataset for end-to-end autonomous driving

Tugrul Gorgulu, Atakan Dag, M. Esat Kalfaoglu, Halil Ibrahim Kuru, Baris Can Cam, Halil Ibrahim Ozturk, Ozsel Kilinc · 2026

Collecting a high-quality dataset is a critical task that demands meticulous attention to detail, as overlooking certain aspects can render the entire dataset unusable. Autonomous driving challenges r…

Engineering Preprint PDF DOI

Signal Temporal Logic Verification and Synthesis Using Deep Reachability Analysis and Layered Control Architecture

Joonwon Choi, Kartik Anand Pant, Youngim Nam, Henry Hellmann, Karthik Nune, Inseok Hwang · 2026

We propose a signal temporal logic (STL)-based framework that rigorously verifies the feasibility of a mission described in STL and synthesizes control to safely execute it. The proposed framework ens…

Engineering Preprint PDF DOI

DySL-VLA: Efficient Vision-Language-Action Model Inference via Dynamic-Static Layer-Skipping for Robot Manipulation

Zebin Yang, Yijiahao Qi, Tong Xie, Bo Yu, Shaoshan Liu, Meng Li · 2026

Vision-Language-Action (VLA) models have shown remarkable success in robotic tasks like manipulation by fusing a language model's reasoning with a vision model's 3D understanding. However, their high …

Engineering Preprint PDF DOI

Rethinking the Practicality of Vision-language-action Model: A Comprehensive Benchmark and An Improved Baseline

Wenxuan Song, Jiayi Chen, Xiaoquan Sun, Huashuo Lei, Yikai Qin, Wei Zhao, Pengxiang Ding, Han Zhao, Tongxin Wang, Pengxu Hou, Zhide Zhong, Haodong Yan, Donglin Wang, Jun Ma, Haoang Li · 2026

Vision-Language-Action (VLA) models have emerged as generalist robotic agents. However, existing VLAs are hindered by excessive parameter scales, prohibitive pre-training requirements, and limited ap…

Engineering Preprint PDF DOI

Metamorphic Testing of Vision-Language Action-Enabled Robots

Pablo Valle, Sergio Segura, Shaukat Ali, Aitor Arrieta · 2026

Vision-Language-Action (VLA) models are multimodal robotic task controllers that, given an instruction and visual inputs, produce a sequence of low-level control actions (or motor commands) enabling a…

Engineering Preprint PDF DOI

SignVLA: A Gloss-Free Vision-Language-Action Framework for Real-Time Sign Language-Guided Robotic Manipulation

Xinyu Tan, Ningwei Bai, Harry Gardener, Zhengyang Zhong, Luoyu Zhang, Liuhaichen Yang, Zhekai Duan, Monkgogi Galeitsiwe, Zezhi Tang · 2026

We present, to our knowledge, the first sign language-driven Vision-Language-Action (VLA) framework for intuitive and inclusive human-robot interaction. Unlike conventional approaches that rely on glo…

Engineering Preprint PDF DOI

When to Act, Ask, or Learn: Uncertainty-Aware Policy Steering

Jessie Yuan, Yilin Wu, Andrea Bajcsy · 2026

Policy steering is an emerging way to adapt robot behaviors at deployment-time: a learned verifier analyzes low-level action samples proposed by a pre-trained policy (e.g., diffusion policy) and selec…

Engineering Preprint PDF DOI

Full Waveform Inversion using the Wasserstein metric for ultrasound transducer array based NDT

Daniel Rossato, Thiago Alberto Rigo Passarin, Gustavo Pinto Pires, Daniel Rodrigues Pipa · 2026

Ultrasonic imaging methods often assume linear direct models, while in reality, many nonlinear phenomena are present, e.g. multiple reflections. A family of imaging methods called Full Waveform Invers…

Engineering Preprint PDF DOI

Behavioral Cloning for Robotic Connector Assembly: An Empirical Study

Andreas Kernbach, Daniel Bargmann, Werner Kraus, Marco F. Huber · 2026

Automating the assembly of wire harnesses is challenging in automotive, electrical cabinet, and aircraft production, particularly due to deformable cables and a high variance in connector geometries. …

Engineering Preprint PDF DOI

Trust in Autonomous Human–Robot Collaboration: Effects of Responsive Interaction Policies

Shauna Heron, Meng Cheng Lau · 2026

Trust plays a central role in human–robot collaboration, yet its formation is rarely examined under the constraints of fully autonomous interaction. This pilot study investigated how interaction poli…

Engineering Preprint PDF DOI

Transmission Delay Minimization for NOMA-Based F-RANs

Yuan Ai, Xidong Mu, Pengbo Si, Yuanwei Liu · 2026

A novel non-orthogonal multiple access (NOMA) based low-delay service framework is proposed for fog radio access networks (F-RANs). Fog access points (FAPs) leverage NOMA for local delivery of cached …

Engineering Preprint PDF DOI

DECODE: Dual-Enhanced Conditioned Diffusion for EEG Forecasting

Mehran Shabanpour, Sadaf Khademi, Konstantinos N Plataniotis, Arash Mohammadi · 2026

Forecasting Electroencephalography (EEG) signals during cognitive events remains a fundamental challenge in neuroscience and Brain-Computer Interfaces (BCIs), as existing methods struggle to capture bo…

Engineering Preprint PDF DOI

TG-ASR: Translation-Guided Learning with Parallel Gated Cross Attention for Low-Resource Automatic Speech Recognition

Cheng-Yeh Yang, Chien-Chun Wang, Li-Wei Chen, Hung-Shin Lee, Hsin-Min Wang, Berlin Chen · 2026

Low-resource automatic speech recognition (ASR) continues to pose significant challenges, primarily due to the limited availability of transcribed data for numerous languages. While a wealth of spoken…

Engineering Preprint PDF DOI

World Guidance: World Modeling in Condition Space for Action Generation

Yue Su, Sijin Chen, Haixin Shi, Mingyu Liu, Zhengshen Zhang, Ningyuan Huang, Weiheng Zhong, Zhengbang Zhu, Yuxiao Liu, Xihui Liu · 2026

Leveraging future observation modeling to facilitate action generation presents a promising avenue for enhancing the capabilities of Vision-Language-Action (VLA) models. However, existing approaches s…
