Programming Languages in Engineering — Research Repository

Engineering Preprint PDF DOI

Semantic Satellite Communications for Synchronized Audiovisual Reconstruction

Fangyu Liu, Peiwen Jiang, Wenjin Wang, Chao-Kai Wen, Xiao Li, Shi Jin · 2026

Satellite communications face severe bottlenecks in supporting high-fidelity synchronized audiovisual services, as conventional schemes struggle with cross-modal coherence under fluctuating channel co…

Engineering Preprint PDF DOI

FutureVLA: Joint Visuomotor Prediction for Vision-Language-Action Model

Xiaoxu Xu, Hao Li, Jinhui Ye, Yilun Chen, Jia Zeng, Xinyi Chen, Linning Xu, Dahua Lin, Weixin Li, Jiangmiao Pang · 2026

Predictive foresight is important to intelligent embodied agents. Since the motor execution of a robot is intrinsically constrained by its visual perception of environmental geometry, effectively anti…

Engineering Preprint PDF DOI

Parallel-in-Time Nonlinear Optimal Control via GPU-native Sequential Convex Programming

Yilin Zou, Zhong Zhang, Maxime Robic, Fanghua Jiang · 2026

Real-time trajectory optimization for nonlinear constrained autonomous systems is critical and typically performed by CPU-based sequential solvers. Specifically, reliance on global sparse linear algeb…

Engineering Preprint PDF DOI

OnFly: Onboard Zero-Shot Aerial Vision-Language Navigation toward Safety and Efficiency

Guiyong Zheng, Yueting Ban, Mingjie Zhang, Juepeng Zheng, Boyu Zhou · 2026

Aerial vision-language navigation (AVLN) enables UAVs to follow natural-language instructions in complex 3D environments. However, existing zero-shot AVLN methods often suffer from unstable single-str…

Engineering Preprint PDF DOI

Cybo-Waiter: A Physical Agentic Framework for Humanoid Whole-Body Locomotion-Manipulation

Peng Ren, Haoyang Ge, Chuan Qi, Cong Huang, Hong Li, Jiang Zhao, Pei Chi, Kai Chen · 2026

Robots are increasingly expected to execute open-ended natural-language requests in human environments, which demands reliable long-horizon execution under partial observability. This is especially ch…

Engineering Preprint PDF DOI

AdaClearGrasp: Learning Adaptive Clearing for Zero-Shot Robust Dexterous Grasping in Densely Cluttered Environments

Zixuan Chen, Wenquan Zhang, Jing Fang, Ruiming Zeng, Zhixuan Xu, Yiwen Hou, Xinke Wang, Jieqi Shi, Jing Huo, Yang Gao · 2026

In densely cluttered environments, physical interference, visual occlusions, and unstable contacts often cause direct dexterous grasping to fail, while aggressive singulation strategies may compromise…

Engineering Preprint PDF DOI

Synthetic Data Domain Adaptation for ASR via LLM-based Text and Phonetic Respelling Augmentation

Natsuo Yamashita, Koichi Nagatsuka, Hiroaki Kokubo, Kota Dohi, Tuan Vu Ho · 2026

End-to-end automatic speech recognition often degrades on domain-specific data due to scarce in-domain resources. We propose a synthetic-data-based domain adaptation framework with two contributions: …

Engineering Preprint PDF DOI

DepthCache: Depth-Guided Training-Free Visual Token Merging for Vision-Language-Action Model Inference

Yuquan Li, Lianjie Ma, Han Ding, Lijun Zhu · 2026

Vision-Language-Action (VLA) models enable generalist robotic manipulation but suffer from high inference latency. This bottleneck stems from the massive number of visual tokens processed by large lan…

Engineering Preprint PDF DOI

DiT4DiT: Jointly Modeling Video Dynamics and Actions for Generalizable Robot Control

Teli Ma, Jia Zheng, Zifan Wang, Chunli Jiang, Andy Cui, Junwei Liang, Shuo Yang · 2026

Vision-Language-Action (VLA) models have emerged as a promising paradigm for robot learning, but their representations are still largely inherited from static image-text pretraining, leaving physical …

Engineering Preprint PDF DOI

KnowDiffuser: A Knowledge-Guided Diffusion Planner with LLM Reasoning

Fan Ding, Xuewen Luo, Fengze Yang, Bo Yu, HwaHui Tew, Ganesh Krishnasamy, Junn Yong Loo · 2026

Recent advancements in Language Models (LMs) have demonstrated strong semantic reasoning capabilities, enabling their application in high-level decision-making for autonomous driving (AD). However, LM…

Engineering Preprint PDF DOI

COHORT: Hybrid RL for Collaborative Large DNN Inference on Multi-Robot Systems Under Real-Time Constraints

Mohammad Saeid Anwar, Anuradha Ravi, Indrajeet Ghosh, Gaurav Shinde, Carl Busart, Nirmalya Roy · 2026

Large deep neural networks (DNNs), especially transformer-based and multimodal architectures, are computationally demanding and challenging to deploy on resource-constrained edge platforms like field …

Engineering Preprint PDF DOI

FireRedASR2S: A State-of-the-Art Industrial-Grade All-in-One Automatic Speech Recognition System

Kaituo Xu, Yan Jia, Kai Huang, Junjie Chen, Wenpeng Li, Kun Liu, Feng-Long Xie, Xu Tang, Yao Hu · 2026

We present FireRedASR2S, a state-of-the-art industrial-grade all-in-one automatic speech recognition (ASR) system. It integrates four modules in a unified pipeline: ASR, Voice Activity Detection (VAD)…

Engineering Preprint PDF DOI

Harf-Speech: A Clinically Aligned Framework for Arabic Phoneme-Level Speech Assessment

Asif Azad, MD Sadik Hossain Shanto, Mohammad Sadat Hossain, Bdour Alwuqaysi, Sabri Boughorbel, Yahya Bokhari, Abdulrhman Aljouie, Ayah Othman Sindi, Ehsan Hoque · 2026

Automated phoneme-level pronunciation assessment is vital for scalable speech therapy and language learning, yet validated tools for Arabic remain scarce. We present Harf-Speech, a modular system scor…

Engineering Preprint PDF DOI

Speech Codec Probing from Semantic and Phonetic Perspectives

Xuan Shi, Chang Zeng, Tiantian Feng, Shih-Heng Wang, Jianbo Ma, Shrikanth Narayanan · 2026

Speech tokenizers are essential for connecting speech to large language models (LLMs) in multimodal systems. These tokenizers are expected to preserve both semantic and acoustic information for downst…

Engineering Preprint PDF DOI

Multi-Modal Intelligent Channel Modeling: From Fine-tuned LLMs to Pre-trained Foundation Models

Lu Bai, Zengrui Han, Mingran Sun, Xiang Cheng · 2026

To meet the evolving demands of sixth-generation (6G) wireless channel modeling, such as precise prediction capability, extension capabilities, and system participation capability, multi-modal intelli…

Engineering Preprint PDF DOI

Simulation-in-the-Reasoning (SiR): A Conceptual Framework for Empirically Grounded AI in Autonomous Transportation

Wuping Xin · 2026

Large Language Models (LLMs) have advanced reasoning through techniques like Chain-of-Thought (CoT). However, their reasoning largely remains textual and hypothetical, lacking empirical grounding in …

Engineering Preprint PDF DOI

SELF-VLA: A Skill Enhanced Agentic Vision-Language-Action Framework for Contact-Rich Disassembly

Chang Liu, Sibo Tian, Xiao Liang, Minghui Zheng · 2026

Disassembly automation has long been pursued to address the growing demand for efficient and proper recovery of valuable components from the end-of-life (EoL) electronic products. Existing approaches …

Engineering Preprint PDF DOI

TATIC: Task-Aware Temporal Learning for Human Intent Inference from Physical Corrections in Human-Robot Collaboration

Jiurun Song, Xiao Liang, Minghui Zheng · 2026

In human-robot collaboration (HRC), robots must adapt online to dynamic task constraints and evolving human intent. While physical corrections provide a natural, low-latency channel for operators to c…

Engineering Preprint PDF DOI

Hierarchical Task Model Predictive Control for Sequential Mobile Manipulation Tasks

Xintong Du, Siqi Zhou, Angela P. Schoellig · 2026

Mobile manipulators are envisioned to serve more complex roles in people's everyday lives. With recent breakthroughs in large language models, task planners have become better at translating human ver…

Engineering Preprint PDF DOI

DRAFTO: Decoupled Reduced-space and Adaptive Feasibility-repair Trajectory Optimization for Robotic Manipulators

Yichang Feng, Xiao Liang, Minghui Zheng · 2026

This paper introduces a new algorithm for trajectory optimization, Decoupled Reduced-space and Adaptive Feasibility-repair Trajectory Optimization (DRAFTO). It first constructs a constrained objective…

