225+ open-access research outputs.
POSHAN Abhiyan envisages capacity building of AWWs or frontline health workers through 21 training modules of ILA (Incremental Learning Approach), modularising the net learning content into smaller le…
We present AVID, the first large-scale benchmark for audio-visual inconsistency understanding in videos. While omni-modal large language models excel at temporally aligned tasks such as captioning and…
Probabilistic settings (e.g., vanishing-error channel coding) and non-probabilistic settings (e.g., zero-error channel coding and adversarial channels) were considered two related but different branch…
Word error rate (WER) is the dominant metric for automatic speech recognition, yet it cannot detect a systematic failure mode: models that produce fluent output in the wrong writing system. We define …
Dr. David Blackwell was a mathematician and statistician of the first rank, whose contributions to statistical theory, game theory, and decision theory predated many of the algorithmic breakthroughs t…
This paper presents our solution for the DL Sprint 4.0, addressing the dual challenges of Bengali Long-Form Speech Recognition (Task 1) and Speaker Diarization (Task 2). Processing long-form, multi-sp…
Recent advances in automatic speech recognition (ASR) and speech enhancement have led to a widespread assumption that improving perceptual audio quality should directly benefit recognition accuracy. I…
We study curvature-driven edge reweighting for community recovery in the balanced two-block stochastic block model. Given a graph G with initial weights equal to the adjacency matrix, we iteratively u…
Bengali remains a low-resource language in speech technology, especially for complex tasks like long-form transcription and speaker diarization. This paper presents a multistage approach developed for…
Teenagers are avid users of Discord, a fast growing platform for synchronous communication where they often interact with strangers. Because Discord combines private DMs, semi-private voice channels, …
Although Automatic Speech Recognition (ASR) in Bengali has seen significant progress, processing long-duration audio and performing robust speaker diarization remain critical research gaps. To address…
Bengali, despite being one of the most widely spoken languages globally, remains underrepresented in long form speech technology, particularly in systems addressing transcription and speaker attributi…
Bengali (Bangla) remains under-resourced in long-form speech technology despite its wide use. We present Bengali-Loop, two community benchmarks to address this gap: (1) a long-form ASR corpus of 191 r…
In their 1991 paper "Algebraic Reconstruction of Types and Effects," Pierre Jouvelot and David Gifford presented a type-and-effect reconstruction algorithm based on an algebraic structure of effects. …
Indigenous languages face significant cultural oppression from official state languages, particularly in the Global South. We investigate the Bangladeshi Chakma language revitalization movement, a com…
The rapid growth of speech synthesis and voice conversion systems has made deepfake audio a major security concern. Bengali deepfake detection remains largely unexplored. In this work, we study automa…
Recently, Maggiorano et al. (2025) claimed that they have developed a strongly polynomial-time combinatorial algorithm for the nucleolus in convex games that is based on the reduced game approach and …
Artificial intelligence is accelerating a new era of food innovation, connecting data from farm to consumer to improve formulation, processing, and health outcomes. Recent advances in deep learning, n…
Over the past few years, improving LLM code generation capabilities has been a key focus in NLP research. Despite Bengali having 242 million native speakers worldwide, it receives little attention whe…
Accessing legal help in Bangladesh is hard. People face high fees, complex legal language, a shortage of lawyers, and millions of unresolved court cases. Generative AI models like OpenAI GPT-4.1 Mini,…
Free open-access publishing with Google Scholar indexing.
Submission Guide →