14,737+ open-access research outputs.
Vision-Language-Action (VLA) models have gained much attention from the research community thanks to their strength in translating multimodal observations with linguistic instructions into desired rob…
Low-altitude communication networks (LACNs) serve as the critical infrastructure of the emerging low-altitude economy (LAE), supporting services such as drone delivery and infrastructure inspection. H…
Visual-Language-Action (VLA) models represent a paradigm shift in embodied AI, yet existing frameworks often struggle with imprecise spatial perception, suboptimal multimodal fusion, and instability i…
Being one of the oldest and most basic problems in image processing, image denoising has seen a resurgence spurred by rapid advances in deep learning. Yet, most modern denoising architectures make lim…
Image-goal navigation steers an agent to a target location specified by an image in unseen environments. Existing methods primarily handle this task by learning an end-to-end navigation policy, which …
Whole-body humanoid locomotion is challenging due to high-dimensional control, morphological instability, and the need for real-time adaptation to various terrains using onboard perception. Directly a…
Deploying a humanoid robot to manipulate a new object has traditionally required one to two days of effort: data collection, manual annotation, 3D model acquisition, and model training. This paper pre…
Implicit spatial relations and deep semantic structures encoded in object attributes are crucial for procedural planning in embodied AI systems. However, existing approaches often over-rely on the rea…
Near Field Communication (NFC) cards are widely used for identification, but their passive nature often limits the ability to incorporate additional security mechanisms. As a result, anyone holding th…
Collecting embodied interaction data at scale remains costly and difficult due to the limited accessibility of conventional interfaces. We present a gamified data collection framework based on Unity t…
Generalist embodied agents must perform interactive, causally-dependent reasoning, continually interacting with the environment, acquiring information, and updating plans to solve long-horizon tasks b…
Can we learn the physics of matter in motion directly from images and video--and trust it? Answering this question requires integrating experiments, physics-based simulation, and data across tradition…
Vision-language-action (VLA) models have emerged as generalist robotic controllers capable of mapping visual observations and natural language instructions to continuous action sequences. However, VLA…
This comprehensive report distinguishes prior works by the cognitive functions they innovate. Many works claim an almost "human-like" cognitive capability in their world models. To evaluate these clai…
Consumer LiDARs in mobile devices and robots typically output a single depth value per pixel. Yet internally, they record full time-resolved histograms containing direct and multi-bounce light returns…
Lung cancer remains one of the leading causes of cancer-related mortality worldwide. Conventional computed tomography (CT) imaging, while essential for detection and staging, has limitations in distin…
High-precision indoor sensing using monostatic multiple-input multiple-output (MIMO) radar typically relies on increasing the physical aperture size of antennas, leading to high hardware complexity an…
Diffusion policies are becoming mainstream in robotic manipulation but suffer from hard negative class imbalance due to uniform sampling and lack of sample difficulty awareness, leading to slow traini…
Quantitative speed-of-sound (SoS) and attenuation of tissues are closely related to pathology; however, conventional B-mode images are limited to qualitative visualization. Existing ultrasound full-wa…
Fungal protein materials exhibit inherently anisotropic microstructures formed by networks of hyphae, which suggest a natural pathway to replicate the fibrous texture of animal meat. We probe whether …