59+ open-access research outputs.
Financial misinformation poses significant threats to financial market stability and individuals' investment decisions. The multilingual environment and the inherent complexity of financial informatio…
Consumer LiDARs in mobile devices and robots typically output a single depth value per pixel. Yet internally, they record full time-resolved histograms containing direct and multi-bounce light returns…
This paper investigates the robust stability problem of a feedback system in the presence of uncertainties induced by graphical regions in the plane where the scaled relative graphs (SRGs) reside. Our…
There are two major categories of embodied navigation: Vision-Language Navigation (VLN), where agents navigate by following natural language instructions; and Object-Goal Navigation (OGN), where agent…
Temporal task structure is fundamental for bimanual manipulation: a robot must not only know that one action precedes or overlaps another, but also when each action should occur and how long it should…
Ill-posed imaging inverse problems remain challenging due to the ambiguity in mapping degraded observations to clean images. Diffusion-based generative priors have recently shown promise, but typicall…
Audio-visual video highlight detection aims to automatically identify the most salient moments in videos by leveraging both visual and auditory cues. However, existing models often underutilize the au…
Direct Speech-to-Speech Translation (S2ST) has gained increasing attention for its ability to translate speech from one language to another, while reducing error propagation and latency inherent in tr…
Recent advances in singing voice synthesis (SVS) have attracted substantial attention from both academia and industry. With the advent of large language models and novel generative paradigms, producin…
Bengali, spoken by over 300 million people, is a morphologically rich and lowresource language, posing challenges for automatic speech recognition (ASR). This research presents an end-to-end framework…
The development of decentralized stability conditions has gained considerable attention due to the need to analyze multi-agent network systems, such as heterogeneous multi-converter power systems. A r…
This paper presents a unified framework based on Davis-Wielandt (DW) shell for graphical stability analysis of multi-input and multi-output linear time-invariant feedback systems. Connections between …
'The cardinal sin in control is to believe that the plant is given' Karl Astrom. Astrom, a towering figure of control theory and practice and awardee of the 1993 IEEE Medal of Honor for his work on ad…
Accurate classification of second-trimester fetal ultrasound images remains challenging due to low image quality, high intra-class variability, and significant class imbalance. In this work, we introd…
Cone-Beam Computed Tomography (CBCT) is essential in medical imaging, and the Feldkamp-Davis-Kress (FDK) algorithm is a popular choice for reconstruction due to its efficiency. However, FDK is suscept…
Visual navigation, a fundamental challenge in mobile robotics, demands versatile policies to handle diverse environments. Classical methods leverage geometric solutions to minimize specific costs, off…
We present an implicit video representation for occlusions, appearance, and motion disentanglement from monocular videos, which we call Video SPatiotemporal Splines (VideoSPatS). Unlike previous metho…
The fusion of LiDAR and camera sensors has demonstrated significant effectiveness in achieving accurate detection for short-range tasks in autonomous driving. However, this fusion approach could face …
A practical navigation agent must be capable of handling a wide range of interaction demands, such as following instructions, searching objects, answering questions, tracking people, and more. Existin…
This paper introduces $\rho$-NeRF, a self-supervised approach that sets a new standard in novel view synthesis (NVS) and computed tomography (CT) reconstruction by modeling a continuous volumetric rad…
Free open-access publishing with Google Scholar indexing.
Submission Guide →