3,601+ open-access research outputs.
We present LatentAM, an online 3D Gaussian Splatting (3DGS) mapping framework that builds scalable latent feature maps from streaming RGB-D observations for open-vocabulary robotic perception. Insteadโฆ
Object navigation is a core capability of embodied intelligence, enabling an agent to locate target objects in unknown environments. Recent advances in vision-language models (VLMs) have facilitated zโฆ
Efficient motion planning for high-dimensional robotic systems, such as manipulators and mobile manipulators, is critical for real-time operation and reliable deployment. Although advances in planningโฆ
Recent advances in vision-language models have made zero-shot navigation feasible, enabling robots to follow natural language instructions without requiring labeling. However, existing methods that exโฆ
Embodied navigation has long been fragmented by task-specific architectures. We introduce ABot-N0, a unified Vision-Language-Action (VLA) foundation model that achieves a ``Grand Unification'' across โฆ
This work investigates bidirectional Mamba (BiMamba) for unified streaming and non-streaming automatic speech recognition (ASR). Dynamic chunk size training enables a single model for offline decodingโฆ
We present SceneVGGT, a spatio-temporal 3D scene understanding framework that combines SLAM with semantic mapping for autonomous and assistive navigation. Built on VGGT, our method scales to long videโฆ
During multi-party interactions, gaze direction is a key indicator of interest and intent, making it essential for social robots to direct their attention appropriately. Understanding the social conteโฆ
While large vision-language models (VLMs) show promise for object goal navigation, current methods still struggle with low success rates and inefficient localization of unseen objects--failures primarโฆ
Real-time voice conversion and speaker anonymization require causal, low-latency synthesis without sacrificing intelligibility or naturalness. Current systems have a core representational mismatch: coโฆ
Mobile robots are often deployed over long durations in diverse open, dynamic scenes, including indoor setting such as warehouses and manufacturing facilities, and outdoor settings such as agriculturaโฆ
Semantic communications (SemComs) have been considered as a promising solution to reduce the amount of transmitted information, thus paving the way for more energy-and spectrum-efficient wireless netwโฆ
Flow-based generative models have emerged as powerful priors for solving inverse problems. One option is to directly optimize the initial latent code (noise), such that the flow output solves the inveโฆ
In millimeter wave integrated sensing and communication (ISAC) systems for intelligent transportation, radar and communication share spectrum and hardware in a time division manner. Radar rapidly deteโฆ
Planning under uncertainty for real-world robotics tasks, such as autonomous driving, requires reasoning in enormous high-dimensional belief spaces, rendering the problem computationally intensive. Whโฆ
Current Vision-Language-Action (VLA) models rely on fixed computational depth, expending the same amount of compute on simple adjustments and complex multi-step manipulation. While Chain-of-Thought (Cโฆ
With the rising prevalence of cardiovascular diseases, electrocardiograms (ECG) remain essential for the non-invasive detection of cardiac abnormalities. This study presents a comprehensive evaluationโฆ
High Dynamic Range (HDR) video reconstruction aims to recover fine brightness, color, and details from Low Dynamic Range (LDR) videos. However, existing methods often suffer from color inaccuracies anโฆ
Fast and differentiable solvers for anisotropic and asymmetric distance fields are a key primitive in geometry processing, enabling gradient-based optimization over metrics, drift fields, and downstreโฆ
Future wireless networks demand increasingly powerful intelligence to support sensing, communication, and autonomous decision-making. While scaling laws suggest improving performance by enlarging modeโฆ
Free open-access publishing with Google Scholar indexing.
Submission Guide โ