David Bengali in Engineering — Research Repository

Engineering Preprint PDF DOI

MFMDQwen: Multilingual Financial Misinformation Detection Based on Large Language Model

Zhiwei Liu, Yuyan Wang, Yuechen Jiang, Yupeng Cao, Tianlei Zhu, Xiaorui Guo, Zhiyang Deng, Zhiyuan Yao, Xiao-Yang Liu, Jimin Huang, Sophia Ananiadou · 2026

Financial misinformation poses significant threats to financial market stability and individuals' investment decisions. The multilingual environment and the inherent complexity of financial informatio…

DENALI: A Dataset Enabling Non-Line-of-Sight Spatial Reasoning with Low-Cost LiDARs

Nikhil Behari, Diego Rivero, Luke Apostolides, Suman Ghosh, Paul Pu Liang, Ramesh Raskar · 2026

Consumer LiDARs in mobile devices and robots typically output a single depth value per pixel. Yet internally, they record full time-resolved histograms containing direct and multi-bounce light returns…

Symmetry Is Almost All You Need: Robust Stability with Uncertainty Induced by Symmetric SRG Regions

Ding Zhang, Di Zhao, Philipp Braun, Jianqi Chen · 2026

This paper investigates the robust stability problem of a feedback system in the presence of uncertainties induced by graphical regions in the plane where the scaled relative graphs (SRGs) reside. Our…

NavTrust: Benchmarking Trustworthiness for Embodied Navigation

Huaide Jiang, Yash Chaudhary, Yuping Wang, Zehao Wang, Raghav Sharma, Manan Mehta, Yang Zhou, Lichao Sun, Zhiwen Fan, Zhengzhong Tu, Jiachen Li · 2026

There are two major categories of embodied navigation: Vision-Language Navigation (VLN), where agents navigate by following natural language instructions; and Object-Goal Navigation (OGN), where agent…

Unified Learning of Temporal Task Structure and Action Timing for Bimanual Robot Manipulation

Christian Dreher, Patrick Dormanns, Andre Meixner, Tamim Asfour · 2026

Temporal task structure is fundamental for bimanual manipulation: a robot must not only know that one action precedes or overlaps another, but also when each action should occur and how long it should…

U-DAVI: Uncertainty-Aware Diffusion-Prior-Based Amortized Variational Inference for Image Reconstruction

Ayush Varshney, Katherine L. Bouman, Berthy T. Feng · 2026

Ill-posed imaging inverse problems remain challenging due to the ambiguity in mapping degraded observations to clean images. Diffusion-based generative priors have recently shown promise, but typicall…

Sounding Highlights: Dual-Pathway Audio Encoders for Audio-Visual Video Highlight Detection

Seohyun Joo, Yoori Oh · 2026

Audio-visual video highlight detection aims to automatically identify the most salient moments in videos by leveraging both visual and auditory cues. However, existing models often underutilize the au…

Timbre-Aware LLM-based Direct Speech-to-Speech Translation Extendable to Multiple Language Pairs

Lalaram Arya, Mrinmoy Bhattacharjee, Adarsh C. R., S. R. Mahadeva Prasanna · 2026

Direct Speech-to-Speech Translation (S2ST) has gained increasing attention for its ability to translate speech from one language to another, while reducing error propagation and latency inherent in tr…

Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches

Changhao Pan, Dongyu Yao, Yu Zhang, Wenxiang Guo, Jingyu Lu, Zhiyuan Zhu, Zhou Zhao · 2026

Recent advances in singing voice synthesis (SVS) have attracted substantial attention from both academia and industry. With the advent of large language models and novel generative paradigms, producin…

Multi-Level Embedding Conformer Framework for Bengali Automatic Speech Recognition

Md. Nazmus Sakib, Golam Mahmud, Md. Maruf Bangabashi, Umme Ara Mahinur Istia, Md. Jahidul Islam, Partha Sarker, Afra Yeamini Prity · 2025

Bengali, spoken by over 300 million people, is a morphologically rich and low-resource language, posing challenges for automatic speech recognition (ASR). This research presents an end-to-end framework…

Geometric Decentralized Stability Certificate for Power Systems Based on Projecting DW Shells

Linbin Huang, Liangxiao Luo, Ruohan Leng, Huanhai Xin, Dan Wang, Florian Dorfler · 2025

The development of decentralized stability conditions has gained considerable attention due to the need to analyze multi-agent network systems, such as heterogeneous multi-converter power systems. A r…

The Phantom of Davis-Wielandt Shell: A Unified Framework for Graphical Stability Analysis of MIMO LTI Systems

Ding Zhang, Xiaokan Yang, Axel Ringh, Li Qiu · 2025

This paper presents a unified framework based on Davis-Wielandt (DW) shell for graphical stability analysis of multi-input and multi-output linear time-invariant feedback systems. Connections between …

The Milieu, Science & Logic of Feedback Control

Robert R. Bitmead · 2025

'The cardinal sin in control is to believe that the plant is given' Karl Astrom. Astrom, a towering figure of control theory and practice and awardee of the 1993 IEEE Medal of Honor for his work on ad…

Biologically Inspired Deep Learning Approaches for Fetal Ultrasound Image Classification

Rinat Prochii, Elizaveta Dakhova, Pavel Birulin, Maxim Sharaev · 2025

Accurate classification of second-trimester fetal ultrasound images remains challenging due to low image quality, high intra-class variability, and significant class imbalance. In this work, we introd…

Learning Wavelet-Sparse FDK for 3D Cone-Beam CT Reconstruction

Yipeng Sun, Linda-Sophie Schneider, Chengze Ye, Mingxuan Gu, Siyuan Mei, Siming Bayer, Andreas Maier · 2025

Cone-Beam Computed Tomography (CBCT) is essential in medical imaging, and the Feldkamp-Davis-Kress (FDK) algorithm is a popular choice for reconstruction due to its efficiency. However, FDK is suscept…

NaviDiffusor: Cost-Guided Diffusion Model for Visual Navigation

Yiming Zeng, Hao Ren, Shuhang Wang, Junlong Huang, Hui Cheng · 2025

Visual navigation, a fundamental challenge in mobile robotics, demands versatile policies to handle diverse environments. Classical methods leverage geometric solutions to minimize specific costs, off…

VideoSPatS: Video SPatiotemporal Splines for Disentangled Occlusion, Appearance and Motion Modeling and Editing

Juan Luis Gonzalez Bello, Xu Yao, Alex Whelan, Kyle Olszewski, Hyeongwoo Kim, Pablo Garrido · 2025

We present an implicit video representation for occlusions, appearance, and motion disentanglement from monocular videos, which we call Video SPatiotemporal Splines (VideoSPatS). Unlike previous metho…

SaViD: Spectravista Aesthetic Vision Integration for Robust and Discerning 3D Object Detection in Challenging Environments

Tanmoy Dam, Sanjay Bhargav Dharavath, Sameer Alam, Nimrod Lilith, Aniruddha Maiti, Supriyo Chakraborty, Mir Feroskhan · 2025

The fusion of LiDAR and camera sensors has demonstrated significant effectiveness in achieving accurate detection for short-range tasks in autonomous driving. However, this fusion approach could face …

Uni-NaVid: A Video-based Vision-Language-Action Model for Unifying Embodied Navigation Tasks

Jiazhao Zhang, Kunyu Wang, Shaoan Wang, Minghan Li, Haoran Liu, Songlin Wei, Zhongyuan Wang, Zhizheng Zhang, He Wang · 2024

A practical navigation agent must be capable of handling a wide range of interaction demands, such as following instructions, searching objects, answering questions, tracking people, and more. Existin…

$\rho$-NeRF: Leveraging Attenuation Priors in Neural Radiance Field for 3D Computed Tomography Reconstruction

Li Zhou, Changsheng Fang, Bahareh Morovati, Yongtong Liu, Shuo Han, Yongshun Xu, Hengyong Yu · 2024

This paper introduces $\rho$-NeRF, a self-supervised approach that sets a new standard in novel view synthesis (NVS) and computed tomography (CT) reconstruction by modeling a continuous volumetric rad…
