Laszlo Vidacs in Engineering — Research Repository

Engineering Preprint PDF DOI

Robot Learning from Human Videos: A Survey

Junyi Ma, Erhang Zhang, Haoran Yang, Ditao Li, Chenyang Xu, Guangming Wang, Hesheng Wang · 2026

A critical bottleneck hindering further advancement in embodied AI and robotics is the challenge of scaling robot data. To address this, the field of learning robot manipulation skills from human vide…

Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising

Jun Guo, Qiwei Li, Peiyan Li, Zilong Chen, Nan Sun, Yifei Su, Heyun Wang, Yuan Zhang, Xinghang Li, Huaping Liu · 2026

We propose X-WAM, a Unified 4D World Model that unifies real-time robotic action execution and high-fidelity 4D world synthesis (video + 3D reconstruction) in a single framework, addressing the critic…

Enabling High Error Tolerance in Satellite Video Transmissions by Generative Semantic Communication

Zixin Zhao, Jingzhi Hu, Geoffrey Ye Li · 2026

Low Earth orbit (LEO) satellite relays will significantly extend the coverage of mobile networks, enabling users in remote areas to transmit data of real-time events. Nevertheless, the limited power o…

Learning Human-Intention Priors from Large-Scale Human Demonstrations for Robotic Manipulation

Yifan Xie, YuAn Wang, Guangyu Chen, Jinkun Liu, Yu Sun, Wenbo Ding · 2026

Human videos contain rich manipulation priors, but using them for robot learning remains difficult because raw observations entangle scene understanding, human motion, and embodiment-specific action. …

A Realistic Discrete Event Simulation model for Ambulance Location and Deployment within a regional Emergency Medical Service

Alberto De Santis, Stefania Iannazzo, Fabio Ingravalle, Stefano Lucidi, Massimo Maurici, Giulia Riccardi, Massimo Roma, Antonio Vinci · 2026

The objective of Emergency Medical Services (EMSs) is to promptly respond to calls from citizens for first aid, providing pre-hospital care and, if necessary, transferring patients to an appropriate Em…

BridgeACT: Bridging Human Demonstrations to Robot Actions via Unified Tool-Target Affordances

Yifan Han, Jianxiang Liu, Haoyu Zhang, Yuqi Gu, Yunhan Guo, Wenzhao Lian · 2026

Learning robot manipulation from human videos is appealing due to the scale and diversity of human demonstrations, but transferring such demonstrations to executable robot behavior remains challenging…

GCImOpt: Learning efficient goal-conditioned policies by imitating optimal trajectories

Jon Goikoetxea, Jesus F. Palacian · 2026

Imitation learning is a well-established approach for machine-learning-based control. However, its applicability depends on having access to demonstrations, which are often expensive to collect and/or…

MTT-Bench: Predicting Social Dominance in Mice via Multimodal Large Language Models

Yunquan Chen, Haoyu Chen · 2026

Understanding social dominance in animal behavior is critical for neuroscience and behavioral studies. In this work, we explore the capability of Multimodal Large Language Models (MLLMs) to analyze raw…

Energy-Efficient Multi-Robot Coverage Path Planning of Non-Convex Regions of Interests

Sourav Raxit, Jose Fuentes, Paulo Padrao, Abdullah Al Redwan Newaz, Md Tamjidul Hoque, Mark Kulp, Leonardo Bobadilla · 2026

This letter presents an energy-efficient multi-robot coverage path planning (MRCPP) framework for large, nonconvex Regions of Interest (ROI) containing obstacles and no-fly zones (NFZ). Existing minim…

Complex Approximate Message Passing with Non-separable Denoising

Vishnu Teja Kunde, Alessandro Mirri, Jean-Francois Chamberland, Enrico Paolini · 2026

Approximate Message Passing (AMP) is a general framework for iterative algorithms, originally developed for compressed sensing and later extended to a wide range of high-dimensional inference problems…

FingerEye: Continuous and Unified Vision-Tactile Sensing for Dexterous Manipulation

Zhixuan Xu, Yichen Li, Xuanye Wu, Tianyu Qiu, Lin Shao · 2026

Dexterous robotic manipulation requires comprehensive perception across all phases of interaction: pre-contact, contact initiation, and post-contact. Such continuous feedback allows a robot to adapt i…

Predicting food taste with bound-driven optimization

Pagkratis Tagkopoulos, Dimitris Sfondilis, Ilias Tagkopoulos, Tarek Zohdi · 2026

The prediction of sensory attributes from ingredient-level formulations is an emerging challenge at the intersection of food science and artificial intelligence. We address the fundamental question of…

JoyAI-RA 0.1: A Foundation Model for Robotic Autonomy

Tianle Zhang, Zhihao Yuan, Dafeng Chi, Peidong Liu, Dongwei Li, Kejun Hu, Likui Zhang, Junnan Nie, Ziming Wei, Zengjue Chen, Yili Tang, Jiayi Li, Zhiyuan Xiang, Mingyang Li, Tianci Luo, Hanwen Wan, Ao Li, Linbo Zhai, Zhihao Zhan, Xiaodong Bai, Jiakun Cai, Peng Cao, Kangliang Chen, Siang Chen, Yixiang Dai, Shuai Di, Yicheng Gong, Chenguang Gui, Yucheng Guo, Peng Hao, Qingrong He, Haoyang Huang, Kunrui Huang, Zhixuan Huang, Shibo Jin, Yixiang Jin, Anson Li, Dongjiang Li, Jiawei Li, Ruodai Li, Yihang Li, Yuzhen Li, Jiaming Liang, Fangsheng Liu, Jing Long, Mingxi Luo, Xing Pan, Hui Shen, Xiaomeng Tian, Daming Wang, Song Wang, Junwu Xiong, Hang Xu, Wanting Xu, Zhengcheng Yu, He Zhang, Jiyao Zhang, Lin Zhao, Chen Zhou, Nan Duan, Yuzheng Zhuang, Liang Lin · 2026

Robotic autonomy in open-world environments is fundamentally limited by insufficient data diversity and poor cross-embodiment generalization. Existing robotic datasets are often limited in scale and t…

VLA Foundry: A Unified Framework for Training Vision-Language-Action Models

Jean Mercat, Sedrick Keh, Kushal Arora, Isabella Huang, Paarth Shah, Haruki Nishimura, Shun Iwase, Katherine Liu · 2026

We present VLA Foundry, an open-source framework that unifies LLM, VLM, and VLA training in a single codebase. Most open-source VLA efforts specialize on the action training stage, often stitching tog…

RoboWM-Bench: A Benchmark for Evaluating World Models in Robotic Manipulation

Feng Jiang, Yang Chen, Kyle Xu, Yuchen Liu, Haifeng Wang, Zhenhao Shen, Jasper Lu, Shengze Huang, Yuanfei Wang, Chen Xie, Ruihai Wu · 2026

Recent advances in large-scale video world models have enabled increasingly realistic future prediction, raising the prospect of leveraging imagined videos for robot learning. However, visual realism …

Will People Enjoy a Robot Trainer? A Case Study with Snoopie the Pacerbot

Maximilian Du, Jennifer Grannen, Shuran Song, Dorsa Sadigh · 2026

The physicality of exercise makes the role of athletic trainers unique. Their physical presence allows them to guide a student through a motion, demonstrate an exercise, and give intuitive feedback. R…

RSMA-Aided Full-Duplex Networks Under Imperfect CSI and SIC: Performance Evaluation

Farjam Karim, Nurul Huda Mahmood, Deepak Kumar, Arthur Sousa de Sena, Matti Latva-aho · 2026

This work investigates a full-duplex (FD)-enhanced Rate-Splitting Multiple Access (RSMA) system under practical constraints, including imperfect channel state information (CSI) and successive interfer…

VIDS: A Verified Imaging Dataset Standard for Medical AI

Joan S. Muthu, John Shalen · 2026

Medical imaging AI development is fundamentally dependent on annotated datasets, yet no existing standard provides machine-enforceable validation across dataset structure, annotation provenance, quali…

Watching Physics: the Generative Science of Matter and Motion

Hagen Holthusen, Kevin Linka, Ellen Kuhl · 2026

Can we learn the physics of matter in motion directly from images and video--and trust it? Answering this question requires integrating experiments, physics-based simulation, and data across tradition…

DENALI: A Dataset Enabling Non-Line-of-Sight Spatial Reasoning with Low-Cost LiDARs

Nikhil Behari, Diego Rivero, Luke Apostolides, Suman Ghosh, Paul Pu Liang, Ramesh Raskar · 2026

Consumer LiDARs in mobile devices and robots typically output a single depth value per pixel. Yet internally, they record full time-resolved histograms containing direct and multi-bounce light returns…
