571+ open-access research outputs.
We introduce LRS-VoxMM, an in-the-wild benchmark for audio-visual speech recognition (AVSR). The benchmark is derived from VoxMM, a dataset of diverse real-world spoken conversations with human-annota…
Annotating long-horizon robotic demonstrations with precise temporal action boundaries is crucial for training and evaluating action segmentation and manipulation policy learning methods. Existing ann…
Deploying a neuro-symbolic task planner on a new domain today requires significant manual effort: a domain expert must author relaxation and complementary rules, and hundreds of training problems must…
In this paper, the quasi-constant modulus (QCM) property is analyzed and leveraged in the design of nonlinearity-tolerant four-dimensional (4D) modulation formats. Accordingly, we propose a family of …
Medical imaging AI development is fundamentally dependent on annotated datasets, yet no existing standard provides machine-enforceable validation across dataset structure, annotation provenance, quali…
The enhanced Gaussian noise (EGN) model is widely used for estimating the nonlinear interference (NLI) power accumulated in coherent fiber-optic transmission systems. Given a fixed fiber link, under t…
Human-robot collaboration in industrial settings requires precise and reliable communication to enhance operational efficiency. While Large Language Models (LLMs) understand general language, they oft…
Robotic manipulation requires understanding both the 3D spatial structure of the environment and its temporal evolution, yet most existing policies overlook one or both. They typically rely on 2D visu…
With the rise of vision-language models (VLM), their application for autonomous driving (VLM4AD) has gained significant attention. Meanwhile, in autonomous driving, closed-loop evaluation has become w…
The coexistence of heterogeneous cellular standards (2G-5G) in shared spectrum demands sophisticated RF source separation techniques, yet no public dataset exists for data-driven research on this prob…
Learning motor control for muscle-driven musculoskeletal models is hindered by the computational cost of biomechanically accurate simulation and the scarcity of validated, open full-body models. Here …
Consumer cameras are ubiquitous in aquatic sciences because they are affordable and easy to use, generating vast collections of underwater imagery for ecosystem surveys, monitoring, mapping, and anima…
Camera pipelines receive raw Bayer-format frames that need to be denoised, demosaiced, and often super-resolved. Multiple frames are captured to utilize natural hand tremors and enhance resolution. Mu…
We set out to study whether task-based narratives could influence long-term engagement with a service robot. To do so, we deployed a Robo-Barista for five weeks in an over-50's housing complex in Stoc…
We study rigid-body motion planning through multiple sequential narrow openings, which requires long-horizon geometric reasoning because the configuration used to traverse an early opening constrains …
Over the past three decades, countless embodied yet virtual agents have freely evolved inside computer simulations, but vanishingly few were realized as physical robots. This is because evolution was …
Human motion provides rich priors for training general-purpose humanoid control policies, but raw demonstrations are often incompatible with a robot's kinematics and dynamics, limiting their direct us…
This study investigates the effects of nail penetration speed on the safety outcomes of large-format automotive lithium-ion pouch cells. Through six controlled tests varying the speed of nail insertio…
Human-AI joint planning in Unmanned Aerial Vehicles (UAVs) typically relies on control handover when facing environmental uncertainties, which is often inefficient and cognitively demanding for non-ex…
Radars provide robust perception of vehicle surroundings by effectively functioning in poor light and adverse weather conditions. Synthetic aperture radar (SAR) algorithms are employed to address the …