Rishabh Singh in Computer Science — Research Repository

Computer Science Preprint PDF DOI

NVBench: A Benchmark for Speech Synthesis with Non-Verbal Vocalizations

Liumeng Xue, Weizhen Bian, Jiahao Pan, Wenxuan Wang, Yilin Ren, Boyi Kang, Jingbin Hu, Ziyang Ma, Shuai Wang, Xinyuan Qian, Hung-yi Lee, Yike Guo · 2026

Non-verbal vocalizations (NVVs) like laugh, sigh, and sob are essential for human-like speech, yet standardized evaluation remains limited in jointly assessing whether systems can generate the intende…


A 0.5-V Linear Neuromorphic Voltage-to-Spike Encoder Using a Bulk-Driven Transconductor

Meysam Akbari, Erika Covi, Kea-Tiong Tang · 2026

This work introduces an ultralow-power voltage-to-spike encoder that achieves near-linear voltage-to-firing-rate conversion by pairing a linearized bulk-driven transconductor with a DPI-based LIF neur…


The complexity of finite smooth words over binary alphabets

Julien Cassaigne, Raphael Henry · 2026

Smooth words over an alphabet of non-negative integers $\{a,b\}$ are infinite words that are infinitely derivable, the most famous example being the Oldenburger-Kolakoski word over $\{1,2\}$. The main…


Specification Vibing for Automated Program Repair

Taohong Zhu, Lucas C. Cordeiro, Mustafa A. Mustafa, Youcheng Sun · 2026

Large language model (LLM)-driven automated program repair (APR) has advanced rapidly, but most methods remain code-centric: they directly rewrite source code and thereby risk hallucinated, behavioral…


Singing Timbre Popularity Assessment Based on Multimodal Large Foundation Model

Zihao Wang, Ruibin Yuan, Ziqi Geng, Hengjia Li, Xingwei Qu, Xinyi Li, Songye Chen, Haoying Fu, Roger B. Dannenberg, Kejun Zhang · 2025

Automated singing assessment is crucial for education and entertainment. However, existing systems face two fundamental limitations: reliance on reference tracks, which stifles creative expression, an…


Singing a MIS

Sandy Irani, Michael Luby · 2025

We introduce a broadcast model called the singing model, where agents are oblivious of the size and structure of the communication network, even their immediate neighborhood. Agents can sing multiple …


Hallucinating with AI: AI Psychosis as Distributed Delusions

Lucy Osler · 2025

There is much discussion of the false outputs that generative AI systems such as ChatGPT, Claude, Gemini, DeepSeek, and Grok create. In popular terminology, these have been dubbed AI hallucinations. H…


Spatial Audio Processing with Large Language Model on Wearable Devices

Ayushi Mishra, Yang Bai, Priyadarshan Narayanasamy, Nakul Garg, Nirupam Roy · 2025

Integrating spatial context into large language models (LLMs) has the potential to revolutionize human-computer interaction, particularly in wearable devices. In this work, we present a novel system a…


Some remarks on the results derived by Ramy Takieldin and Patrick Solé (2025)

Varsha Chauhan, Anuradha Sharma · 2025

The purpose of this note is to rectify a typographical error in the statements of Theorems 5.5 and 5.6 of Sharma, Chauhan and Singh[3] and further analyze and discuss the significance of the results d…


Characterizing the Interaction of Cultural Evolution Mechanisms in Experimental Social Networks

Raja Marjieh, Manuel Anglada-Tort, Thomas L. Griffiths, Nori Jacoby · 2025

Understanding how cognitive and social mechanisms shape the evolution of complex artifacts such as songs is central to cultural evolution research. Social network topology (what artifacts are availabl…


Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference

Shuqi Dai, Yunyun Wang, Roger B. Dannenberg, Zeyu Jin · 2025

We propose a unified framework for Singing Voice Synthesis (SVS) and Conversion (SVC), addressing the limitations of existing approaches in cross-domain SVS/SVC, poor output musicality, and scarcity o…


Interactive Information Need Prediction with Intent and Context

Kevin Ros, Dhyey Pandya, ChengXiang Zhai · 2025

The ability to predict a user's information need would have wide-ranging implications, from saving time and effort to mitigating vocabulary gaps. We study how to interactively predict a user's informa…


Pitch-and-Spectrum-Aware Singing Quality Assessment with Bias Correction and Model Fusion

Yu-Fei Shi, Yang Ai, Ye-Xin Lu, Hui-Peng Du, Zhen-Hua Ling · 2024

We participated in track 2 of the VoiceMOS Challenge 2024, which aimed to predict the mean opinion score (MOS) of singing samples. Our submission secured the first place among all participating teams,…


Sing-On-Your-Beat: Simple Text-Controllable Accompaniment Generations

Quoc-Huy Trinh, Minh-Van Nguyen, Trong-Hieu Nguyen Mau, Khoa Tran, Thanh Do · 2024

Singing is one of the most cherished forms of human entertainment. However, creating a beautiful song requires an accompaniment that complements the vocals and aligns well with the song instruments an…


An approach to hummed-tune and song sequences matching

Loc Bao Pham, Huong Hoang Luong, Phu Thien Tran, Phuc Hoang Ngo, Vi Hoang Nguyen, Thinh Nguyen · 2024

Melody stuck in your head, also known as "earworm", is tough to get rid of, unless you listen to it again or sing it out loud. But what if you can not find the name of that song? It must be an intoler…


2022 Flood Impact in Pakistan: Remote Sensing Assessment of Agricultural and Urban Damage

Aqs Younas, Arbaz Khan, Hafiz Muhammad Abubakar, Zia Tahseen, Aqeel Arshad, Murtaza Taj, Usman Nazir · 2024

Pakistan was hit by the world's deadliest flood in June 2022, causing agriculture and infrastructure damage across the country. Remote sensing technology offers a cost-effective and efficient method f…


FruitsMusic: A Real-World Corpus of Japanese Idol-Group Songs

Hitoshi Suda, Shunsuke Yoshida, Tomohiko Nakamura, Satoru Fukayama, Jun Ogata · 2024

This study presents FruitsMusic, a metadata corpus of Japanese idol-group songs in the real world, precisely annotated with who sings what and when. Japanese idol-group songs, vital to Japanese pop cu…


Zero-Shot Sing Voice Conversion: built upon clustering-based phoneme representations

Wangjin Zhou, Fengrun Zhang, Yiming Liu, Wenhao Guan, Yi Zhao, Tatsuya Kawahara · 2024

This study presents an innovative Zero-Shot any-to-any Singing Voice Conversion (SVC) method, leveraging a novel clustering-based phoneme representation to effectively separate content, timbre, and si…


Discrepancy Algorithms for the Binary Perceptron

Shuangping Li, Tselil Schramm, Kangjie Zhou · 2024

The binary perceptron problem asks us to find a sign vector in the intersection of independently chosen random halfspaces with intercept $-\kappa$. We analyze the performance of the canonical discrepa…


Generating Music with Structure Using Self-Similarity as Attention

Sophia Hager, Kathleen Hablutzel, Katherine M. Kinnaird · 2024

Despite the innovations in deep learning and generative AI, creating long term structure as well as the layers of repeated structure common in musical works remains an open challenge in music generati…
