16,380+ open-access research outputs.
We introduce LRS-VoxMM, an in-the-wild benchmark for audio-visual speech recognition (AVSR). The benchmark is derived from VoxMM, a dataset of diverse real-world spoken conversations with human-annota…
Accurate localization is a fundamental requirement for autonomous robots operating in indoor environments. Scene graphs encode the spatial structure of an environment as a hierarchy of semantic entiti…
Policy optimization in high-dimensional continuous control for robotics remains a challenging problem. Predominant methods are inherently local and often require extensive tuning and carefully chosen …
We propose a knowledge-driven approach to speech target extraction in the presence of background sound effects already recorded in cinematic audio. The specific knowledge sources studied are manners o…
Nasotracheal intubation (NTI) is a critical clinical procedure for establishing and maintaining patient airway patency. Machine-assisted NTI has emerged as a pivotal approach for optimizing procedural…
Simulating optical tactile sensors presents significant challenges due to their high deformability and intricate optical properties. To address these issues and enable a physically accurate simulation…
Short-term load forecasting for AI data centers presents new challenges because it is computing-driven, with heterogeneous job arrivals, sizes, and durations exhibiting bursty, non-stationary dynamics…
We present the Field of Safe Motion (FSM), a quantitative safety model for determining whether a driver maintains a collision-free escape route, or "out," at any given moment by accounting for that dr…
This paper discusses null-space wrench components in parallel manipulators. We examine the adaptation of the two most common characterizations of these components in grasp-like systems, namely, intera…
This paper presents a planning pipeline framework for locomotion in rope-assisted robots climbing vertical surfaces. The proposed framework is formulated as a bi-level optimization scheme that address…
Automatic feature recognition (AFR) on B-Rep 3D-CAD models is central to CAD/CAM automation, yet most learning-based methods are complex, data-hungry, and evaluate instance grouping and semantic label…
Terahertz (THz) communication has emerged as a key enabler for sixth-generation (6G) networks, offering ultrawide bandwidths to support data-intensive applications such as holographic telepresence and…
The integration of multimodal sensing and millimeter-wave (mmWave) communications is a key enabler for highly mobile vehicle-to-infrastructure (V2I) networks. However, continuous high-resolution visua…
We aim to learn a sparse and connected graph from sparse data, where the number of observations K can be substantially smaller than the signal dimension N for signals x in R^N, and the underlying dist…
Supervised contrastive learning (SupCon) is widely used to shape representations, but has seen limited targeted study for audio deepfake detection. Existing work typically combines contrastive terms w…
With the rapid advancement of computer technologies enabling fast calculations of complex structures, numerical methods have become a central tool in engineering sciences, while physical models have i…
Direction-of-arrival (DOA) estimation is an important task in microphone array processing and many downstream applications. The steered response power with phase transform (SRP-PHAT) method has been w…
The development of intelligent and diversified ser vices in urban rail transit (URT) has resulted in an increasing de mand for high-rate communication between vehicles and ground equipment. However, e…
Low Earth orbit (LEO) satellite relays will significantly extend the coverage of mobile networks, enabling users in remote areas to transmit data of real-time events. Nevertheless, the limited power o…
Weak constitutive fluctuations in dispersive subsurface media can induce distributed clutter that reshapes the observation structure of ground-penetrating radar (GPR). This paper analyzes this effect …
Free open-access publishing with Google Scholar indexing.
Submission Guide →