Expertini Research

Browse Research Papers

9,775+ open-access research outputs.

Showing 9775 results for "programming languages" in Engineering
Engineering Preprint PDF DOI

Towards the Vision-Sound-Language-Action Paradigm: The HEAR Framework for Sound-Centric Manipulation

Chang Nie, Tianchen Deng, Guangming Wang, Zhe Liu, Hesheng Wang · 2026

While recent Vision-Language-Action (VLA) models have begun to incorporate audio, they typically treat sound as static pre-execution prompts or focus exclusively on human speech. This leaves a signifi…

Engineering Preprint PDF DOI

Large Reward Models: Generalizable Online Robot Reward Generation with Vision-Language Models

Yanru Wu, Weiduo Yuan, Ang Qi, Vitor Guizilini, Jiageng Mao, Yue Wang · 2026

Reinforcement Learning (RL) has shown great potential in refining robotic manipulation policies, yet its efficacy remains strongly bottlenecked by the difficulty of designing generalizable reward func…

Engineering Preprint PDF DOI

Compact Optical Single-axis Joint Torque Sensor Using Redundant Photo-Reflectors and Quadratic-Programming Calibration

Hyun-Bin Kim, Byeong-Il Ham, Kyung-Soo Kim · 2026

This study proposes a non-contact photo-reflector-based joint torque sensor for precise joint-level torque control and safe physical interaction. Current-sensor-based torque estimation in many collabo…

Engineering Preprint PDF DOI

Geometry-Aligned LLM Fine-Tuning for Sequential Narrow-Opening Planning

Al Jaber Mahmud, Xuan Wang · 2026

We study rigid-body motion planning through multiple sequential narrow openings, which requires long-horizon geometric reasoning because the configuration used to traverse an early opening constrains …

Engineering Preprint PDF DOI

Safety Case Patterns for VLA-based driving systems: Insights from SimLingo

Gerhard Yu, Fuyuki Ishikawa, Oluwafemi Odu, Alvine Boaye Belle · 2026

Vision-Language-Action (VLA)-based driving systems represent a significant paradigm shift in autonomous driving since, by combining traffic scene understanding, linguistic interpretation, and action g…

Engineering Preprint PDF DOI

Something from Nothing: Data Augmentation for Robust Severity Level Estimation of Dysarthric Speech

Jaesung Bae, Xiuwen Zheng, Minje Kim, Chang D. Yoo, Mark Hasegawa-Johnson · 2026

Dysarthric speech quality assessment (DSQA) is critical for clinical diagnostics and inclusive speech technologies. However, subjective evaluation is costly and difficult to scale, and the scarcity of…

Engineering Preprint PDF DOI

ExpertGen: Scalable Sim-to-Real Expert Policy Learning from Imperfect Behavior Priors

Zifan Xu, Ran Gong, Maria Vittoria Minniti, Ahmet Salih Gundogdu, Eric Rosen, Kausik Sivakumar, Riedana Yan, Zixing Wang, Di Deng, Peter Stone, Xiaohan Zhang, Karl Schmeckpeper · 2026

Learning generalizable and robust behavior cloning policies requires large volumes of high-quality robotics data. While human demonstrations (e.g., through teleoperation) serve as the standard source …

Engineering Preprint PDF DOI

Embodied Foundation Models at the Edge: A Survey of Deployment Constraints and Mitigation Strategies

Utkarsh Grover, Ravi Ranjan, Mingyang Mao, Trung Tien Dong, Satvik Praveen, Zhenqi Wu, J. Morris Chang, Tinoosh Mohsenin, Yi Sheng, Agoritsa Polyzou, Eiman Kanjo, Xiaomin Lin · 2026

Deploying foundation models in embodied edge systems is fundamentally a systems problem, not just a problem of model compression. Real-time control must operate within strict size, weight, and power c…

Engineering Preprint PDF DOI

CorrectionPlanner: Self-Correction Planner with Reinforcement Learning in Autonomous Driving

Yihong Guo, Dongqiangzi Ye, Sijia Chen, Anqi Liu, Xianming Liu · 2026

Autonomous driving requires safe planning, but most learning-based planners lack explicit self-correction ability: once an unsafe action is proposed, there is no mechanism to correct it. Thus, we prop…

Engineering Preprint PDF DOI

MA-VLCM: A Vision Language Critic Model for Value Estimation of Policies in Multi-Agent Team Settings

Shahil Shaik, Aditya Parameshwaran, Anshul Nayak, Jonathon M. Smereka, Yue Wang · 2026

Multi-agent reinforcement learning (MARL) commonly relies on a centralized critic to estimate the value function. However, learning such a critic from scratch is highly sample-inefficient and often la…

Engineering Preprint PDF DOI

MoE-ACT: Scaling Multi-Task Bimanual Manipulation with Sparse Language-Conditioned Mixture-of-Experts Transformers

Kangjun Guo, Haichao Liu, Yanji Sun, Ruhan Zhao, Jinni Zhou, Jun Ma · 2026

The ability of robots to handle multiple tasks under a unified policy is critical for deploying embodied intelligence in real-world household and industrial applications. However, out-of-distribution …

Engineering Preprint PDF DOI

HapticVLA: Contact-Rich Manipulation via Vision-Language-Action Model without Inference-Time Tactile Sensing

Konstantin Gubernatorov, Mikhail Sannikov, Ilya Mikhalchuk, Egor Kuznetsov, Makar Artemov, Ogunwoye Faith Ouwatobi, Marcelino Fernando, Artem Asanov, Ziang Guo, Dzmitry Tsetserukou · 2026

Tactile sensing is a crucial capability for Vision-Language-Action (VLA) architectures, as it enables dexterous and safe manipulation in contact-rich tasks. However, reliance on dedicated tactile hard…

Engineering Preprint PDF DOI

NavGSim: High-Fidelity Gaussian Splatting Simulator for Large-Scale Navigation

Jiahang Liu, Yuanxing Duan, Jiazhao Zhang, Minghan Li, Shaoan Wang, Zhizheng Zhang, He Wang · 2026

Simulating realistic environments for robots is widely recognized as a critical challenge in robot learning, particularly in terms of rendering and physical simulation. This challenge becomes even mor…

Engineering Preprint PDF DOI

ForceVLA2: Unleashing Hybrid Force-Position Control with Force Awareness for Contact-Rich Manipulation

Yang Li, Zhaxizhuoma, Hongru Jiang, Junjie Xia, Hongquan Zhang, Jinda Du, Yunsong Zhou, Jia Zeng, Ce Hao, Jieji Ren, Qiaojun Yu, Cewu Lu, Yu Qiao, Jiangmiao Pang · 2026

Embodied intelligence for contact-rich manipulation has predominantly relied on position control, while explicit awareness and regulation of interaction forces remain under-explored, limiting stabilit…

Engineering Preprint PDF DOI

Vision-Language Model Based Multi-Expert Fusion for CT Image Classification

Jianfa Bai, Kejin Lu, Runtian Yuan, Qingqiu Li, Jilan Xu, Junlin Hou, Yuejie Zhang, Rui Feng · 2026

Robust detection of COVID-19 from chest CT remains challenging in multi-institutional settings due to substantial source shift, source imbalance, and hidden test-source identities. In this work, we pr…

Engineering Preprint PDF DOI

Confusion-Aware In-Context-Learning for Vision-Language Models in Robotic Manipulation

Yayun He, Zuheng Kang, Botao Zhao, Zhouyin Wu, Junqing Peng, Jianzong Wang · 2026

Vision-language models (VLMs) have significantly improved the generalization capabilities of robotic manipulation. However, VLM-based systems often suffer from a lack of robustness, leading to unpredi…

Engineering Preprint PDF DOI

AeroGrab: A Unified Framework for Aerial Grasping in Cluttered Environments

Shivansh Pratap Singh, Naveen Sudheer Nair, Samaksh Ujjawal, Sarthak Mishra, Soham Patil, Rishabh Dev Yadav, Spandan Roy · 2026

Reliable aerial grasping in cluttered environments remains challenging due to occlusions and collision risks. Existing aerial manipulation pipelines largely rely on centroid-based grasping and lack in…

Engineering Preprint PDF DOI

Beam Prediction Based on Multimodal Large Language Models

Tianhao Mao, Le Liang, Jie Yang, Xiao Li, Shi Jin, Geoffrey Ye Li · 2026

Accurate beam prediction is a key enabler for next-generation wireless communication systems. In this paper, we propose a multimodal large language model (LLM)-based beam prediction framework that eff…

Engineering Preprint PDF DOI

AnoleVLA: Lightweight Vision-Language-Action Model with Deep State Space Models for Mobile Manipulation

Yusuke Takagi, Motonari Kambara, Daichi Yashima, Koki Seno, Kento Tokura, Komei Sugiura · 2026

In this study, we address the problem of language-guided robotic manipulation, where a robot is required to manipulate a wide range of objects based on visual observations and natural language instruc…

Engineering Preprint PDF DOI

Learning from Mistakes: Post-Training for Driving VLA with Takeover Data

Yinfeng Gao, Deqing Liu, Qichao Zhang, Yupeng Zheng, Haochen Tian, Guang Li, Hangjun Ye, Long Chen, Da-Wei Ding, Dongbin Zhao · 2026

Current Vision-Language-Action (VLA) paradigms in end-to-end autonomous driving rely on offline training from static datasets, leaving them vulnerable to distribution shift. Recent post-training metho…
