Programming Languages in Engineering — Research Repository

Engineering Preprint PDF DOI

VTAM: Video-Tactile-Action Models for Complex Physical Interaction Beyond VLAs

Haoran Yuan, Weigang Yi, Zhenyu Zhang, Wendi Chen, Yuchen Mo, Jiashi Yin, Xinzhuo Li, Xiangyu Zeng, Chuan Wen, Cewu Lu, Katherine Driggs-Campbell, Ismini Lourentzou · 2026

Video-Action Models (VAMs) have emerged as a promising framework for embodied intelligence, learning implicit world dynamics from raw video streams to produce temporally consistent action predictions.…

Customized User Plane Processing via Code Generating AI Agents for Next Generation Mobile Networks

Xiaowen Ma, Onur Ayan, Yunpu Ma, Xueli An · 2026

Generative AI is envisioned to have a crucial impact on next generation mobile networking, making the sixth generation (6G) system considerably more autonomous, flexible, and adaptive than its predece…

A Multimodal Framework for Human-Multi-Agent Interaction

Shaid Hasan, Breenice Lee, Sujan Sarker, Tariq Iqbal · 2026

Human-robot interaction is increasingly moving toward multi-robot, socially grounded environments. Existing systems struggle to integrate multimodal perception, embodied expression, and coordinated de…

Rewriting TTS Inference Economics: Lightning V2 on Tenstorrent Achieves 4x Lower Cost Than NVIDIA L40S

Ranjith M. S., Akshat Mandloi, Sudarshan Kamath · 2026

Text-to-Speech (TTS) models are significantly more numerically fragile than Large Language Models (LLMs) due to their continuous waveform generation and perceptual sensitivity to small numerical pertu…

Prompt Amplification and Zero-Shot Late Fusion in Audio-Language Models for Speech Emotion Recognition

Saurabh Kataria, Xiao Hu · 2026

Audio-Language Models (ALMs) are making strides in understanding speech and non-speech audio. However, domain-specialist Foundation Models (FMs) remain the best for closed-ended speech processing task…

Task-Aware Positioning for Improvisational Tasks in Mobile Construction Robots via an AI Agent with Multi-LMM Modules

Seongju Jang, Francis Baek, SangHyun Lee · 2026

Due to the ever-changing nature of construction, many tasks on sites occur in an improvisational manner. Existing mobile construction robot studies remain limited in addressing improvisational tasks, …

Agile-VLA: Few-Shot Industrial Pose Rectification via Implicit Affordance Anchoring

Teng Yan, Zhengyang Pei, Chengyu Shi, Yue Yu, Yikun Chen, Zilong Zhu, Zelin Fang, Kaile Guo, Zihang Wang, Peigen Tian, Bingzhuo Zhong · 2026

Deploying Vision-Language-Action (VLA) models on resource-constrained edge platforms encounters a fundamental conflict between high-latency semantic inference and the high-frequency control required f…

Grounding Sim-to-Real Generalization in Dexterous Manipulation: An Empirical Study with Vision-Language-Action Models

Ruixing Jin, Zicheng Zhu, Ruixiang Ouyang, Sheng Xu, Bo Yue, Zhizheng Wu, Guiliang Liu · 2026

Learning a generalist control policy for dexterous manipulation typically relies on large-scale datasets. Given the high cost of real-world data collection, a practical alternative is to generate synt…

Retrieval-Guided Photovoltaic Inventory Estimation from Satellite Imagery for Distribution Grid Planning

Muhao Guo, Lihao Mai, Erik Blasch, Jafarali Parol, Turki Rakan, Yang Weng · 2026

The rapid expansion of distributed rooftop photovoltaic (PV) systems introduces increasing uncertainty in distribution grid planning, hosting capacity assessment, and voltage regulation. Reliable esti…

CATNAV: Cached Vision-Language Traversability for Efficient Zero-Shot Robot Navigation

Aditya Potnis, Francisco Affonso, Shreya Gummadi, Naveen Kumar Uppalapati, Girish Chowdhary · 2026

Navigating unstructured environments requires assessing traversal risk relative to a robot's physical capabilities, a challenge that varies across embodiments. We present CATNAV, a cost-aware traversa…

SG-VLA: Learning Spatially-Grounded Vision-Language-Action Models for Mobile Manipulation

Ruisen Tu, Arth Shukla, Sohyun Yoo, Xuanlin Li, Junxi Li, Jianwen Xie, Hao Su, Zhuowen Tu · 2026

Vision-Language-Action (VLA) models show promise for robotic control, yet performance in complex household environments remains sub-optimal. Mobile manipulation requires reasoning about global scene l…

Universal Formula Families for Safe Stabilization of Single-Input Nonlinear Systems

Bo Wang, Miroslav Krstic · 2026

We develop an optimization-free framework for safe stabilization of single-input control-affine nonlinear systems with a given control Lyapunov function (CLF) and a given control barrier function (CBF…

GIFT: Generalizing Intent for Flexible Test-Time Rewards

Fin Amin, Nathaniel Dennler, Andreea Bobu · 2026

Robots learn reward functions from user demonstrations, but these rewards often fail to generalize to new environments. This failure occurs because learned rewards latch onto spurious correlations in …

TrustTrade: Human-Inspired Selective Consensus Reduces Decision Uncertainty in LLM Trading Agents

Minghan Li, Rachel Gonsalves, Weiyue Li, Sunghoon Yoon, Mengyu Wang · 2026

Large language models (LLMs) are increasingly deployed as autonomous agents in financial trading. However, they often exhibit a hazardous behavioral bias that we term uniform trust, whereby retrieved …

CaP-X: A Framework for Benchmarking and Improving Coding Agents for Robot Manipulation

Max Fu, Justin Yu, Karim El-Refai, Ethan Kou, Haoru Xue, Huang Huang, Wenli Xiao, Guanzhi Wang, Fei-Fei Li, Guanya Shi, Jiajun Wu, Shankar Sastry, Yuke Zhu, Ken Goldberg, Linxi "Jim" Fan · 2026

"Code-as-Policy" considers how executable code can complement data-intensive Vision-Language-Action (VLA) methods, yet their effectiveness as autonomous controllers for embodied manipulation remains u…

UniDex: A Robot Foundation Suite for Universal Dexterous Hand Control from Egocentric Human Videos

Gu Zhang, Qicheng Xu, Haozhe Zhang, Jianhan Ma, Long He, Yiming Bao, Zeyu Ping, Zhecheng Yuan, Chenhao Lu, Chengbo Yuan, Tianhai Liang, Xiaoyu Tian, Maanping Shao, Feihong Zhang, Mingyu Ding, Yang Gao, Hao Zhao, Hang Zhao, Huazhe Xu · 2026

Dexterous manipulation remains challenging due to the cost of collecting real-robot teleoperation data, the heterogeneity of hand embodiments, and the high dimensionality of control. We present UniDex…

Closed-Loop Verbal Reinforcement Learning for Task-Level Robotic Planning

Dmitrii Plotnikov, Iaroslav Kolomiets, Dmitrii Maliukov, Dmitrij Kosenkov, Daniia Zinniatullina, Artem Trandofilov, Georgii Gazaryan, Kirill Bogatikov, Timofei Kozlov, Igor Duchinskii, Mikhail Konenkov, Miguel Altamirano Cabrera, Dzmitry Tsetserukou · 2026

We propose a new Verbal Reinforcement Learning (VRL) framework for interpretable task-level planning in mobile robotic systems operating under execution uncertainty. The framework follows a closed-loo…

ROBOGATE: Adaptive Failure Discovery for Safe Robot Policy Deployment via Two-Stage Boundary-Focused Sampling

Azuki Kim · 2026

Deploying learned robot manipulation policies in industrial settings requires rigorous pre-deployment validation, yet exhaustive testing across high-dimensional parameter spaces is intractable. We pre…

Programming Manufacturing Robots with Imperfect AI: LLMs as Tuning Experts for FDM Print Configuration Selection

Ekta U. Samani, Christopher G. Atkeson · 2026

We use fused deposition modeling (FDM) 3D printing as a case study of how manufacturing robots can use imperfect AI to acquire process expertise. In FDM, print configuration strongly affects output qu…

Do World Action Models Generalize Better than VLAs? A Robustness Study

Zhanguang Zhang, Zhiyuan Li, Behnam Rahmati, Rui Heng Yang, Yintao Ma, Amir Rasouli, Sajjad Pakdamansavoji, Yangzheng Wu, Lingfeng Zhang, Tongtong Cao, Feng Wen, Xinyu Wang, Xingyue Quan, Yingxue Zhang · 2026

Robot action planning in the real world is challenging as it requires not only understanding the current state of the environment but also predicting how it will evolve in response to actions. Vision-…
