Mohammad Dashti — Research Repository

AI & Data Science · Preprint

DASH-KV: Accelerating Long-Context LLM Inference via Asymmetric KV Cache Hashing

Jinyu Guo, Zhihan Zhang, Yutong Li, Jiehui Xie, Md. Tamim Iqbal, Dongshen Han, Lik-Hang Lee, Sung-Ho Bae, Jie Zou, Yang Yang, Chaoning Zhang · 2026

The quadratic computational complexity of the standard attention mechanism constitutes a fundamental bottleneck for large language models in long-context inference. While existing KV cache compression…

Computer Science · Preprint

End-to-End Performance of Video Streaming With MPEG-DASH Over Satellite 5G IAB Networks

Muhammad Adeel Zahid, Ekram Hossain, Peng Hu · 2026

We present an end-to-end performance evaluation of MPEG-DASH video streaming over a Low-Earth Orbit (LEO) satellite-based 5G Integrated Access and Backhaul (IAB) network. Our objective is to investiga…

Computer Science · Preprint

Bandwidth Cost of Locally Repairable Convertible Codes in the Global Merge Regime

Saransh Chopra, Shubhransh Singhvi, K.V. Rashmi · 2026

Recent studies have shown that distributed storage systems can achieve significant space savings by adapting redundancy levels to varying disk failure rates. This adaptation is performed via code conv…

Physics · Preprint

FAST and Dark: A catalogue of Dark Galaxy Candidates within 50 Mpc

Marco Monaci, Duncan A. Forbes, Jonah S. Gannon, Barbel S. Koribalski, Kenji Bekki, Jean P. Brodie, Warrick J. Couch · 2026

Using the first data release of the Five-hundred-meter Aperture Spherical radio Telescope (FAST) All-Sky HI survey (FASHI), we compile a catalogue of 70 dark galaxy candidates (DGCs) within 50 Mpc. We…

Physics · Preprint

Optical identification of the FASHI sources: toward the extended Local Volume

Aleksandra E. Nazarova, Dmitry I. Makarov, Igor D. Karachentsev, Chuan-Peng Zhang, Maksim I. Chazov, Ming Zhu · 2026

We extracted a list of 662 nearby (within $\sim16$ Mpc) HI-detection sources from the Five-hundred-meter Aperture Spherical radio Telescope (FAST) All Sky HI Survey (FASHI) and made a visual identific…

Computer Science · Preprint

Script Collapse in Multilingual ASR: Defining and Measuring Script Fidelity Rate

Hanif Rahman · 2026

Word error rate (WER) is the dominant metric for automatic speech recognition, yet it cannot detect a systematic failure mode: models that produce fluent output in the wrong writing system. We define …

AI & Data Science · Preprint

Fine-tuning Whisper for Pashto ASR: strategies and scale

Hanif Rahman · 2026

Pashto is absent from Whisper's pre-training corpus despite being one of CommonVoice's largest language collections, leaving off-the-shelf models unusable: all Whisper sizes output Arabic, Dari, or Ur…

AI & Data Science · Preprint

Neural Assistive Impulses: Synthesizing Exaggerated Motions for Physics-based Characters

Zhiquan Wang, Bedrich Benes · 2026

Physics-based character animation has become a fundamental approach for synthesizing realistic, physically plausible motions. While current data-driven deep reinforcement learning (DRL) methods can sy…

AI & Data Science · Preprint

Benchmarking Multilingual Speech Models on Pashto: Zero-Shot ASR, Script Failure, and Cross-Domain Evaluation

Hanif Rahman · 2026

Pashto is spoken by approximately 60--80 million people but has no published benchmarks for multilingual automatic speech recognition (ASR) on any shared public test set. This paper reports the first …

Physics · Preprint

A critical analysis of main-sequence fitting in open clusters to derive the helium-to-metal enrichment ratio $\Delta Y/\Delta Z$

G. Valle, N. Ricci, M. Dell'Omodarme, P.G. Prada Moroni, S. Degl'Innocenti, S. Cassisi · 2026

We aim to investigate the feasibility of accurately determining the helium-to-metal enrichment ratio $\Delta Y/\Delta Z$ for open clusters using Gaia DR3 photometry. To test the reliability of this ca…

Mathematics · Preprint

Cliques in graphs constructed from Strongly Orthogonal Subsets in exceptional root systems

Patrick J. Browne, Padraig O Cathain · 2026

Given a root system $R$, two roots are said to be \emph{strongly orthogonal} if neither their sum nor difference is a root. Gashi defined a family of graphs with vertices labelled by sums of $k$-eleme…

Physics · Preprint

Study of Integrated Far-ultraviolet Emissions from Galactic Globular Clusters using AstroSat/UVIT observations

Sonika Piridi, Ranjan Kumar, Divya Pandey, Ananta C. Pradhan · 2026

We used observations obtained with the Ultraviolet Imaging Telescope on board the AstroSat satellite to measure the integrated far-ultraviolet (FUV) and optical (V) magnitudes of 30 Galactic globular …

AI & Data Science · Preprint

Pashto Common Voice: Building the First Open Speech Corpus for a 60-Million-Speaker Low-Resource Language

Hanif Rahman, Shafeeq ur Rehman · 2026

We present the Pashto Common Voice corpus -- the first large-scale, openly licensed speech resource for Pashto, a language with over 60 million native speakers largely absent from open speech technolo…

AI & Data Science · Preprint

The Last Fingerprint: How Markdown Training Shapes LLM Prose

E. M. Freeburg · 2026

Large language models produce em dashes at varying rates, and the observation that some models "overuse" them has become one of the most widely discussed markers of AI-generated text. Yet no mechanist…

Mathematics · Preprint

Pseudofiniteness of the Farey Graph

Connor Martinez Lockhart · 2026

We prove that the theory of the Farey graph is pseudofinite by constructing a sequence of finite structures that satisfy increasingly large subsets of its first-order axiomatization. This graph is an …

Physics · Preprint

HI Gas and Star Formation in Major Galaxy Pairs from the FAST All-Sky HI Survey (FASHI)

Shulan Yan, Qingzheng Yu, Taotao Fang, Chuan He, Andrew Ma, Junfeng Wang, C. Kevin Xu, Ming Zhu, Weishan Zhu · 2026

Atomic hydrogen (HI) plays a fundamental role in fueling star formation in galaxies. However, the behavior of HI gas in interacting systems, particularly galaxy pairs, remains elusive. In this work, w…

Mathematics · Preprint

Strong spectral gap for geometrically finite hyperbolic manifolds

Dubi Kelmer, Osama Khalil, Pratyush Sarkar · 2026

Let $\Gamma < G := \operatorname{SO}(d+1, 1)$ for $d \geq 1$ be a Zariski dense, geometrically finite, discrete subgroup with critical exponent strictly greater than $d/2$. We show that $L^2(\Gamma\ba…

AI & Data Science · Preprint

PashtoCorp: A 1.25-Billion-Word Corpus, Evaluation Suite, and Reproducible Pipeline for Low-Resource Language Development

Hanif Rahman · 2026

We present PashtoCorp, a 1.25-billion-word corpus for Pashto, a language spoken by 60 million people that remains severely underrepresented in NLP. The corpus is assembled from 39 sources spanning sev…

Computer Science · Preprint

DASH: Dynamic Audio-Driven Semantic Chunking for Efficient Omnimodal Token Compression

Bingzhou Li, Tao Huang · 2026

Omnimodal large language models (OmniLLMs) jointly process audio and visual streams, but the resulting long multimodal token sequences make inference prohibitively expensive. Existing compression meth…

AI & Data Science · Preprint

OasisSimp: An Open-source Asian-English Sentence Simplification Dataset

Hannah Liu, Muxin Tian, Iqra Ali, Haonan Gao, Qiaoyiwen Wu, Blair Yang, Uthayasanker Thayasivam, En-Shiun Annie Lee, Pakawat Nakwijit, Surangika Ranathunga, Ravi Shekhar · 2026

Sentence simplification aims to make complex text more accessible by reducing linguistic complexity while preserving the original meaning. However, progress in this area remains limited for mid-resour…
