41,527+ open-access research outputs.
Loco-manipulation is a key capability for legged robots to perform practical mobile manipulation tasks, such as transporting and pushing objects, in real-world environments. However, learning robust lโฆ
Vision-language-action (VLA) models have shown great potential in building generalist robots, but still face a dilemma-misalignment of 2D image forecasting and 3D action prediction. Besides, such a viโฆ
Drones equipped with overhead manipulators offer unique capabilities for inspection, maintenance, and contact-based interaction. However, the motion of the drone and its manipulator is tightly linked,โฆ
Programming robots to perform complex tasks is often difficult and time consuming, requiring expert knowledge and skills in robot software and sometimes hardware. Imitation learning is a method for trโฆ
The prevailing paradigm for image-goal visual navigation often assumes access to large-scale datasets, substantial pretraining, and significant computational resources. In this work, we challenge thisโฆ
Task-oriented grasping (TOG) is more challenging than simple object grasping because it requires precise identification of object parts and careful selection of grasping areas to ensure effective and โฆ
Deformable image registration remains a central challenge in medical image analysis, particularly under multi-modal scenarios where intensity distributions vary significantly across scans. While deep โฆ
In deployment of the VLA models to real-world robotic tasks, execution speed matters. In previous work arXiv:2510.26742 we analyze how to make neural computation of VLAs on GPU fast. However, we leaveโฆ
Efficiently predicting motion plans directly from vision remains a fundamental challenge in robotics, where planning typically requires explicit goal specification and task-specific design. Recent visโฆ
Vision--Language--Action (VLA) models that encode actions using a discrete tokenization scheme are increasingly adopted for robotic manipulation, but existing decoding paradigms remain fundamentally lโฆ
Markerless biomechanics increasingly relies on 3D skeletal keypoints extracted from video, yet downstream biomechanical mappings typically treat these estimates as deterministic, providing no principlโฆ
Remote photoplethysmography (rPPG) allows for the contactless estimation of physiological signals from facial videos by analyzing subtle skin color changes. However, rPPG signals are extremely susceptโฆ
Echo-planar imaging (EPI) remains the cornerstone of diffusion MRI, but it is prone to severe geometric distortions due to its rapid sampling scheme that renders the sequence highly sensitive to $B_{0โฆ
Heating, ventilation, and air conditioning (HVAC) systems account for a substantial share of building energy consumption. Environmental uncertainty and dynamic occupancy behavior bring challenges in dโฆ
Cone-beam computed tomography (CBCT) images are problematic in clinical medicine because of their low contrast and high artifact content compared with conventional CT images. Although there are some sโฆ
Accurate risk assessment is essential for safety-critical autonomous and control systems under uncertainty. In many real-world settings, stochastic dynamics exhibit asymmetric jumps and long-range memโฆ
Fault detection and identification (FDI) is critical for maintaining the safety and reliability of systems subject to actuator and sensor faults. In this paper, the problem of FDI for nonlinear controโฆ
Storm surge is a significant threat to coastal communities across the globe, responsible for loss of life and enormous property damage. Consequently, significant efforts have been expended to develop โฆ
Visual Navigation Models (VNMs) promise generalizable, robot navigation by learning from large-scale visual demonstrations. Despite growing real-world deployment, existing evaluations rely almost exclโฆ
Scaling robot learning to long-horizon tasks remains a formidable challenge. While end-to-end policies often lack the structural priors needed for effective long-term reasoning, traditional neuro-symbโฆ
Free open-access publishing with Google Scholar indexing.
Submission Guide โ