Programming Languages in Engineering — Research Repository

Engineering Preprint PDF DOI

Humanizing Robot Gaze Shifts: A Framework for Natural Gaze Shifts in Humanoid Robots

Jingchao Wei, Jingkai Qin, Yuxiao Cao, Jingcheng Huang, Xiangrui Zeng, Min Li, Zhouping Yin · 2026

Leveraging auditory and visual feedback for attention reorientation is essential for natural gaze shifts in social interaction. However, enabling humanoid robots to perform natural and context-appropr…

Engineering Preprint PDF DOI

Quality of Automatic Speech Recognition -- Polish Language case study -- from Wav2Vec to Scribe ElevenLabs

Marcin Pietron, Szymon Piorkowski, Kamil Faber, Dominik Zurek, Michał Karwatowski, Jerzy Duda, Hubert Zielinski, Piotr Lipnicki, Mikołaj Leszczuk · 2026

This article concerns comparative studies on the Automatic Speech Recognition (ASR) model incorporated with the Large Language Model (LLM) used for medical interviews. The proposed solution is tested …

Engineering Preprint PDF DOI

Therapist-Robot-Patient Physical Interaction is Worth a Thousand Words: Enabling Intuitive Therapist Guidance via Remote Haptic Control

Beatrice Luciani, Alex van den Berg, Matti Lang, Alexandre L. Ratschat, Laura Marchal-Crespo · 2026

Robotic systems can enhance the amount and repeatability of physically guided motor training. Yet their real-world adoption is limited, partly due to non-intuitive trainer/therapist-trainee/patient in…

Engineering Preprint PDF DOI

Joint-Aligned Latent Action: Towards Scalable VLA Pretraining in the Wild

Hao Luo, Ye Wang, Wanpeng Zhang, Haoqi Yuan, Yicheng Feng, Haiweng Xu, Sipeng Zheng, Zongqing Lu · 2026

Despite progress, Vision-Language-Action models (VLAs) are limited by a scarcity of large-scale, diverse robot data. While human manipulation videos offer a rich alternative, existing methods are forc…

Engineering Preprint PDF DOI

Deep Accurate Solver for the Geodesic Problem

Saar Huberman, Amit Bracha, Ron Kimmel · 2026

A common approach to compute distances on continuous surfaces is by considering a discretized polygonal mesh approximating the surface and estimating distances on the polygon. We show that exact geode…

Engineering Preprint PDF DOI

Two-Stage Active Distribution Network Voltage Control via LLM-RL Collaboration: A Hybrid Knowledge-Data-Driven Approach

Xu Yang, Chenhui Lin, Xiang Ma, Dong Liu, Ran Zheng, Haotian Liu, Wenchuan Wu · 2026

The growing integration of distributed photovoltaics (PVs) into active distribution networks (ADNs) has exacerbated operational challenges, making it imperative to coordinate diverse equipment to miti…

Engineering Preprint PDF DOI

Hierarchical LLM-Based Multi-Agent Framework with Prompt Optimization for Multi-Robot Task Planning

Tomoya Kawabe, Rin Takano · 2026

Multi-robot task planning requires decomposing natural-language instructions into executable actions for heterogeneous robot teams. Conventional Planning Domain Definition Language (PDDL) planners pro…

Engineering Preprint PDF DOI

Self-Correcting VLA: Online Action Refinement via Sparse World Imagination

Chenyv Liu, Wentao Tan, Lei Zhu, Fengling Li, Jingjing Li, Guoli Yang, Heng Tao Shen · 2026

Standard vision-language-action (VLA) models rely on fitting statistical data priors, limiting their robust understanding of underlying physical dynamics. Reinforcement learning enhances physical grou…

Engineering Preprint PDF DOI

SPOC: Safety-Aware Planning Under Partial Observability And Physical Constraints

Hyungmin Kim, Hobeom Jeon, Dohyung Kim, Minsu Jang, Jeahong Kim · 2026

Embodied Task Planning with large language models faces safety challenges in real-world environments, where partial observability and physical constraints must be respected. Existing benchmarks often …

Engineering Preprint PDF DOI

LiLo-VLA: Compositional Long-Horizon Manipulation via Linked Object-Centric Policies

Yue Yang, Shuo Cheng, Yu Fang, Homanga Bharadhwaj, Mingyu Ding, Gedas Bertasius, Daniel Szafir · 2026

General-purpose robots must master long-horizon manipulation, defined as tasks involving multiple kinematic structure changes (e.g., attaching or detaching objects) in unstructured environments. While…

Engineering Preprint PDF DOI

VLA Knows Its Limits

Haoxuan Wang, Gengyu Zhang, Yan Yan, Ramana Rao Kompella, Gaowen Liu · 2026

Action chunking has recently emerged as a standard practice in flow-based Vision-Language-Action (VLA) models. However, the effect and choice of the execution horizon - the number of actions to be exe…

Engineering Preprint PDF DOI

NeuroNarrator: A Generalist EEG-to-Text Foundation Model for Clinical Interpretation via Spectro-Spatial Grounding and Temporal State-Space Reasoning

Guoan Wang, Shihao Yang, Jun-en Ding, Hao Zhu, Feng Liu · 2026

Electroencephalography (EEG) provides a non-invasive window into neural dynamics at high temporal resolution and plays a pivotal role in clinical neuroscience research. Despite this potential, prevail…

Engineering Preprint PDF DOI

ActionReasoning: Robot Action Reasoning in 3D Space with LLM for Robotic Brick Stacking

Guangming Wang, Qizhen Ying, Yixiong Jing, Olaf Wysocki, Brian Sheil · 2026

Classical robotic systems typically rely on custom planners designed for constrained environments. While effective in restricted settings, these systems lack generalization capabilities, limiting the …

Engineering Preprint PDF DOI

HALO: A Unified Vision-Language-Action Model for Embodied Multimodal Chain-of-Thought Reasoning

Quanxin Shou, Fangqi Zhu, Shawn Chen, Puxin Yan, Zhengyang Yan, Yikun Miao, Xiaoyi Pang, Zicong Hong, Ruikai Shi, Hao Huang, Jie Zhang, Song Guo · 2026

Vision-Language-Action (VLA) models have shown strong performance in robotic manipulation, but often struggle in long-horizon or out-of-distribution scenarios due to the lack of explicit mechanisms fo…

Engineering Preprint PDF DOI

Notes-to-Self: Scratchpad Augmented VLAs for Memory Dependent Manipulation Tasks

Sanjay Haresh, Daniel Dijkman, Apratim Bhattacharyya, Roland Memisevic · 2026

Many dexterous manipulation tasks are non-Markovian in nature, yet little attention has been paid to this fact in the recent upsurge of the vision-language-action (VLA) paradigm. Although they are suc…

Engineering Preprint PDF DOI

IG-RFT: An Interaction-Guided RL Framework for VLA Models in Long-Horizon Robotic Manipulation

Zhian Su, Weijie Kong, Haonan Dong, Huixu Dong · 2026

Vision-Language-Action (VLA) models have demonstrated significant potential for generalist robotic policies; however, they struggle to generalize to long-horizon complex tasks in novel real-world doma…

Engineering Preprint PDF DOI

Grid-Mind: An LLM-Orchestrated Multi-Fidelity Agent for Automated Connection Impact Assessment

Mohamed Shamseldein · 2026

Large language models (LLMs) have demonstrated remarkable tool-use capabilities, yet their application to power system operations remains largely unexplored. This paper presents Grid-Mind, a domain-sp…

Engineering Preprint PDF DOI

BFA++: Hierarchical Best-Feature-Aware Token Prune for Multi-View Vision Language Action Model

Haosheng Li, Weixin Mao, Zihan Lan, Hongwei Xiong, Hongan Wang, Chenyang Si, Ziwei Liu, Xiaoming Deng, Hua Chen · 2026

Vision-Language-Action (VLA) models have achieved significant breakthroughs by leveraging Large Vision Language Models (VLMs) to jointly interpret instructions and visual inputs. However, the substant…

Engineering Preprint PDF DOI

Strategy-Supervised Autonomous Laparoscopic Camera Control via Event-Driven Graph Mining

Keyu Zhou, Peisen Xu, Yahao Wu, Jiming Chen, Gaofeng Li, Shunlei Li · 2026

Autonomous laparoscopic camera control must maintain a stable and safe surgical view under rapid tool-tissue interactions while remaining interpretable to surgeons. We present a strategy-grounded fram…

Engineering Preprint PDF DOI

Application of Large Language Models for Container Throughput Forecasting: Incorporating Contextual Information in Port Logistics

Minseop Kim, Jaeeun Kwon, Hanbyeol Park, Kikun Park, Taekhyun Park, Hyerim Bae · 2026

Recent advancements in generative artificial intelligence (AI) have demonstrated its substantial potential in various fields. However, its application in port logistics remains underexplored. Ports ar…
