Gene Cheung in Engineering — Research Repository

Engineering Preprint PDF DOI

DiscreteRTC: Discrete Diffusion Policies are Natural Asynchronous Executors

Pengcheng Wang, Kaiwen Hong, Chensheng Peng, Katherine Driggs-Campbell, Masayoshi Tomizuka, Chenfeng Xu, Chen Tang · 2026

Unlike chatbots, physical AI must act while the world keeps evolving. Therefore, the inter-chunk pause of synchronous executors are fatal for dynamic tasks regardless of how fast the inference is. Asy…

Read Paper →

Engineering Preprint PDF DOI

Beam Scheduling for Cross-Layer ISAC: A Deep Reinforcement Learning Approach

Xiyu Wang, Gilberto Berardinelli, Hei Victor Cheng, Petar Popovski, Ramoni Adeogun · 2026

Resource allocation in integrated sensing and communication (ISAC) systems needs to be optimized to balance the requirements of the communication and sensing modules considering complicated cross-laye…

Read Paper →

Engineering Preprint PDF DOI

An Exponentially stable Extended Kalman Filter with Estimate dependent Process noise Covariance for Chemical Reaction Networks

Suryasnata Dash, Abhishek Dey · 2026

Biomolecular systems are often modeled with partially known nonlinear stochastic dynamics, making state and parameter estimation a central challenge. While Kalman filtering techniques are widely used …

Read Paper →

Engineering Preprint PDF DOI

DM-ASR: Diarization-aware Multi-speaker ASR with Large Language Models

Li Li, Ming Cheng, Weixin Zhu, Yannan Wang, Juan Liu, Ming Li · 2026

Multi-speaker automatic speech recognition (ASR) aims to transcribe conversational speech involving multiple speakers, requiring the model to capture not only what was said, but also who said it and s…

Read Paper →

Engineering Preprint PDF DOI

Navigating the Clutter: Waypoint-Based Bi-Level Planning for Multi-Robot Systems

Jiabao Ji, Yongchao Chen, Yang Zhang, Ramana Rao Kompella, Chuchu Fan, Gaowen Liu, Shiyu Chang · 2026

Multi-robot control in cluttered environments is a challenging problem that involves complex physical constraints, including robot-robot collisions, robot-obstacle collisions, and unreachable motions.…

Read Paper →

Engineering Preprint PDF DOI

Efficient Reinforcement Learning using Linear Koopman Dynamics for Nonlinear Robotic Systems

Wenjian Hao, Yuxuan Fang, Zehui Lu, Shaoshuai Mou · 2026

This paper presents a model-based reinforcement learning (RL) framework for optimal closed-loop control of nonlinear robotic systems. The proposed approach learns linear lifted dynamics through Koopma…

Read Paper →

Engineering Preprint PDF DOI

Reducing the Offline-Streaming Gap for Unified ASR Transducer with Consistency Regularization

Andrei Andrusenko, Vladimir Bataev, Lilit Grigoryan, Nune Tadevosyan, Vitaly Lavrukhin, Boris Ginsburg · 2026

Unification of automatic speech recognition (ASR) systems reduces development and maintenance costs, but training a single model to perform well in both offline and low-latency streaming settings rema…

Read Paper →

Engineering Preprint PDF DOI

ST-$\pi$: Structured SpatioTemporal VLA for Robotic Manipulation

Chuanhao Ma, Hanyu Zhou, Shihan Peng, Yan Li, Tao Gu, Luxin Yan · 2026

Vision-language-action (VLA) models have achieved great success on general robotic tasks, but still face challenges in fine-grained spatiotemporal manipulation. Typically, existing methods mainly embe…

Read Paper →

Engineering Preprint PDF DOI

Rewind-IL: Online Failure Detection and State Respawning for Imitation Learning

Gehan Zheng, Sanjay Seenivasan, Matthew Johnson-Roberson, Weiming Zhi · 2026

Imitation learning has enabled robots to acquire complex visuomotor manipulation skills from demonstrations, but deployment failures remain a major obstacle, especially for long-horizon action-chunked…

Read Paper →

Engineering Preprint PDF DOI

Whole-Body Mobile Manipulation using Offline Reinforcement Learning on Sub-optimal Controllers

Snehal Jauhri, Vignesh Prasad, Georgia Chalvatzaki · 2026

Mobile Manipulation (MoMa) of articulated objects, such as opening doors, drawers, and cupboards, demands simultaneous, whole-body coordination between a robot's base and arms. Classical whole-body co…

Read Paper →

Engineering Preprint PDF DOI

AMO-ENE: Attention-based Multi-Omics Fusion Model for Outcome Prediction in Extra Nodal Extension and HPV-associated Oropharyngeal Cancer

Gautier Henique, William Le, Gabriel Dayan, Coralie Brodeur, Kristoff Nelson, Apostolos Christopoulos, Edith Filion, Phuc-Felix Nguyen-Tan, Laurent Letourneau-Guillon, Houda Bahig, Samuel Kadoury · 2026

Extranodal extension (ENE) is an emerging prognostic factor in human papillomavirus (HPV)-associated oropharyngeal cancer (OPC), although it is currently omitted as a clinical staging criteria. Recent…

Read Paper →

Engineering Preprint PDF DOI

Genie Sim PanoRecon: Fast Immersive Scene Generation from Single-View Panorama

Zhijun Li, Yongxin Su, Di Yang, Jichao Wang, Zheyuan Xing, Qian Wang, Maoqing Yao · 2026

We present Genie Sim PanoRecon, a feed-forward Gaussian-splatting pipeline that delivers high-fidelity, low-cost 3D scenes for robotic manipulation simulation. The panorama input is decomposed into si…

Read Paper →

Engineering Preprint PDF DOI

HiPolicy: Hierarchical Multi-Frequency Action Chunking for Policy Learning

Jiyao Zhang, Zimu Han, Junhan Wang, Xionghao Wu, Shihong Lin, Jinzhou Li, Hongwei Fan, Ruihai Wu, Dongjiang Li, Hao Dong · 2026

Robotic imitation learning faces a fundamental trade-off between modeling long-horizon dependencies and enabling fine-grained closed-loop control. Existing fixed-frequency action chunking approaches s…

Read Paper →

Engineering Preprint PDF DOI

Adaptive Action Chunking at Inference-time for Vision-Language-Action Models

Yuanchang Liang, Xiaobo Wang, Kai Wang, Shuo Wang, Xiaojiang Peng, Haoyu Chen, David Kim Huat Chua, Prahlad Vadakkepat · 2026

In Vision-Language-Action (VLA) models, action chunking (i.e., executing a sequence of actions without intermediate replanning) is a key technique to improve robotic manipulation abilities. However, a…

Read Paper →

Engineering Preprint PDF DOI

Build on Priors: Vision--Language--Guided Neuro-Symbolic Imitation Learning for Data-Efficient Real-World Robot Manipulation

Pierrick Lorang, Johannes Huemer, Timothy Duggan, Kai Goebel, Patrik Zips, Matthias Scheutz · 2026

Enabling robots to learn long-horizon manipulation tasks from a handful of demonstrations remains a central challenge in robotics. Existing neuro-symbolic approaches often rely on hand-crafted symboli…

Read Paper →

Engineering Preprint PDF DOI

Open-Loop Planning, Closed-Loop Verification: Speculative Verification for VLA

Zihua Wang, Zhitao Lin, Ruibo Li, Yu Zhang, Xu Yang, Siya Mi, Xiu-Shen Wei · 2026

Vision-Language-Action (VLA) models, as large foundation models for embodied control, have shown strong performance in manipulation tasks. However, their performance comes at high inference cost. To i…

Read Paper →

Engineering Preprint PDF DOI

StreamingVLA: Streaming Vision-Language-Action Model with Action Flow Matching and Adaptive Early Observation

Yiran Shi, Dongqi Guo, Tianchen Zhao, Feng Gao, Liangzhi Shi, Chao Yu, ZhiJian Mo, Qihua Xiao, XiaoShuai Peng, Qingmin Liao, Yu Wang · 2026

Vision-language-action (VLA) models have demonstrated exceptional performance in natural language-driven perception and control. However, the high computational cost of VLA models poses significant ef…

Read Paper →

Engineering Preprint PDF DOI

HiFlow: Tokenization-Free Scale-Wise Autoregressive Policy Learning via Flow Matching

Daichi Yashima, Koki Seno, Shuhei Kurita, Yusuke Oda, Komei Sugiura · 2026

Coarse-to-fine autoregressive modeling has recently shown strong promise for visuomotor policy learning, combining the inference efficiency of autoregressive methods with the global trajectory coheren…

Read Paper →

Engineering Preprint PDF DOI

SwarmCoDe: A Scalable Co-Design Framework for Heterogeneous Robot Swarms via Dynamic Speciation

Andrew Wilhelm, Josie Hughes · 2026

Robot swarms offer inherent robustness and the capacity to execute complex, collaborative tasks surpassing the capabilities of single-agent systems. Co-designing these systems is critical, as marginal…

Read Paper →

Engineering Preprint PDF DOI

MMaDA-VLA: Large Diffusion Vision-Language-Action Model with Unified Multi-Modal Instruction and Generation

Yang Liu, Pengxiang Ding, Tengyue Jiang, Xudong Wang, Wenxuan Song, Minghui Lin, Han Zhao, Hongyin Zhang, Zifeng Zhuang, Wei Zhao, Siteng Huang, Jinkui Shi, Donglin Wang · 2026

Vision-Language-Action (VLA) models aim to control robots for manipulation from visual observations and natural-language instructions. However, existing hierarchical and autoregressive paradigms often…

Read Paper →

Browse Research Papers

DiscreteRTC: Discrete Diffusion Policies are Natural Asynchronous Executors

Beam Scheduling for Cross-Layer ISAC: A Deep Reinforcement Learning Approach

An Exponentially stable Extended Kalman Filter with Estimate dependent Process noise Covariance for Chemical Reaction Networks

DM-ASR: Diarization-aware Multi-speaker ASR with Large Language Models

Navigating the Clutter: Waypoint-Based Bi-Level Planning for Multi-Robot Systems

Efficient Reinforcement Learning using Linear Koopman Dynamics for Nonlinear Robotic Systems

Reducing the Offline-Streaming Gap for Unified ASR Transducer with Consistency Regularization

ST-$\pi$: Structured SpatioTemporal VLA for Robotic Manipulation

Rewind-IL: Online Failure Detection and State Respawning for Imitation Learning

Whole-Body Mobile Manipulation using Offline Reinforcement Learning on Sub-optimal Controllers

AMO-ENE: Attention-based Multi-Omics Fusion Model for Outcome Prediction in Extra Nodal Extension and HPV-associated Oropharyngeal Cancer

Genie Sim PanoRecon: Fast Immersive Scene Generation from Single-View Panorama

HiPolicy: Hierarchical Multi-Frequency Action Chunking for Policy Learning

Adaptive Action Chunking at Inference-time for Vision-Language-Action Models

Build on Priors: Vision--Language--Guided Neuro-Symbolic Imitation Learning for Data-Efficient Real-World Robot Manipulation

Open-Loop Planning, Closed-Loop Verification: Speculative Verification for VLA

StreamingVLA: Streaming Vision-Language-Action Model with Action Flow Matching and Adaptive Early Observation

HiFlow: Tokenization-Free Scale-Wise Autoregressive Policy Learning via Flow Matching

SwarmCoDe: A Scalable Co-Design Framework for Heterogeneous Robot Swarms via Dynamic Speciation

MMaDA-VLA: Large Diffusion Vision-Language-Action Model with Unified Multi-Modal Instruction and Generation

Browse by Category

Research Type

Publish Your Research