2,004+ open-access research outputs.
Multi-speaker automatic speech recognition (ASR) aims to transcribe conversational speech involving multiple speakers, requiring the model to capture not only what was said, but also who said it and s…
Speaker diarization (SD) is the task of answering "who spoke when" in a multi-speaker audio stream. Classically, an SD system clusters segments of speech belonging to an individual speaker's identity. …
Large language models are increasingly being explored as interfaces between humans and robotic systems, yet there remains limited evidence on how such technologies can be used not only for interaction…
Modern world models are becoming too complex to admit explicit dynamical descriptions. We study safety-critical contextual control, where a Planner must optimize a task objective using only feasibilit…
Low-altitude communication networks (LACNs) serve as the critical infrastructure of the emerging low-altitude economy (LAE), supporting services such as drone delivery and infrastructure inspection. H…
Medical imaging AI development is fundamentally dependent on annotated datasets, yet no existing standard provides machine-enforceable validation across dataset structure, annotation provenance, quali…
Ultrasound imaging tasks such as calibration, inverse parameter estimation, and acquisition design require models that are physically grounded, efficient, and differentiable with respect to meaningful…
This study investigates whether speech-based depression detection models learn depression-related acoustic biomarkers or instead rely on speaker identity cues. Using the DAIC-WOZ dataset, we propose a…
Oil spills represent a severe threat, making early-stage thickness estimation crucial for guiding remediation efforts. Unmanned Aerial Vehicles (UAVs) are an attractive platform for environmental moni…
Learner satisfaction is a critical quality signal in massive open online courses (MOOCs), directly influencing retention, engagement, and platform reputation. Most existing methods infer satisfaction …
Autonomous vehicles are increasingly deployed in safety-critical applications, where sensing failures or cyberphysical attacks can lead to unsafe operations resulting in human loss and/or severe physi…
Speech enhancement (SE) is critical for improving speech intelligibility and quality in real-world environments, particularly for cochlear implant (CI) users who experience severe degradations in spee…
This paper studies cyber attacks against informativity-based analysis in data-driven control. Focusing on strong observability, we consider an adversary who post-processes finite time-series data by a…
Prediction markets are starting to look less like crowd polls and more like electronic markets. The central question is therefore no longer only whether these markets forecast well, but what happens w…
Large-scale Electric Vehicle (EV) Charging Station (CS) may be too large to be dispatched in real-time via a centralized approach. While a decentralized approach may be a viable solution, the lack of …
Network coordination games are widely used to model collaboration among interconnected agents, with applications across diverse domains including economics, robotics, and cyber-security. We consider n…
Motivated by the need to develop fair and efficient schemes to facilitate the electrification of transport, this paper proposes a non-monetary karma economy for flexible Electric Vehicle (EV) charging…
The human auditory system has the ability to selectively focus on key speech elements in an audio stream while giving secondary attention to less relevant areas such as noise or distortion within the …
Evaluation of robotic manipulation systems has largely relied on fixed benchmarks authored by a small number of experts, where task instances, constraints, and success criteria are predefined and diff…
The rapid advancement of robotics, spanning expanded capabilities, more intuitive interaction, and more integration into real-world workflows, is reshaping what it means for humans and robots to coexi…