Nikhil Singh in Computer Science — Research Repository

Computer Science Preprint PDF DOI

A 2-adjunction between representations and preorder morphisms

Paul Brunet (UPEC UP12, LACL) · 2026

The recently introduced model of representations has been defined and motivated somewhat ex nihilo. In this document, I will show that representations are related to a more "classical" model through…

NVBench: A Benchmark for Speech Synthesis with Non-Verbal Vocalizations

Liumeng Xue, Weizhen Bian, Jiahao Pan, Wenxuan Wang, Yilin Ren, Boyi Kang, Jingbin Hu, Ziyang Ma, Shuai Wang, Xinyuan Qian, Hung-yi Lee, Yike Guo · 2026

Non-verbal vocalizations (NVVs) like laugh, sigh, and sob are essential for human-like speech, yet standardized evaluation remains limited in jointly assessing whether systems can generate the intende…

A 0.5-V Linear Neuromorphic Voltage-to-Spike Encoder Using a Bulk-Driven Transconductor

Meysam Akbari, Erika Covi, Kea-Tiong Tang · 2026

This work introduces an ultralow-power voltage-to-spike encoder that achieves near-linear voltage-to-firing-rate conversion by pairing a linearized bulk-driven transconductor with a DPI-based LIF neur…

The complexity of finite smooth words over binary alphabets

Julien Cassaigne, Raphael Henry · 2026

Smooth words over an alphabet of non-negative integers $\{a,b\}$ are infinite words that are infinitely derivable, the most famous example being the Oldenburger-Kolakoski word over $\{1,2\}$. The main…

Specification Vibing for Automated Program Repair

Taohong Zhu, Lucas C. Cordeiro, Mustafa A. Mustafa, Youcheng Sun · 2026

Large language model (LLM)-driven automated program repair (APR) has advanced rapidly, but most methods remain code-centric: they directly rewrite source code and thereby risk hallucinated, behavioral…

Singing Timbre Popularity Assessment Based on Multimodal Large Foundation Model

Zihao Wang, Ruibin Yuan, Ziqi Geng, Hengjia Li, Xingwei Qu, Xinyi Li, Songye Chen, Haoying Fu, Roger B. Dannenberg, Kejun Zhang · 2025

Automated singing assessment is crucial for education and entertainment. However, existing systems face two fundamental limitations: reliance on reference tracks, which stifles creative expression, an…

Singing a MIS

Sandy Irani, Michael Luby · 2025

We introduce a broadcast model called the singing model, where agents are oblivious of the size and structure of the communication network, even their immediate neighborhood. Agents can sing multiple …

Cross-Attention with Confidence Weighting for Multi-Channel Audio Alignment

Ragib Amin Nihal, Benjamin Yen, Takeshi Ashizawa, Kazuhiro Nakadai · 2025

Multi-channel audio alignment is a key requirement in bioacoustic monitoring, spatial audio systems, and acoustic localization. However, existing methods often struggle to address nonlinear clock drif…

REMOTE: A Unified Multimodal Relation Extraction Framework with Multilevel Optimal Transport and Mixture-of-Experts

Xinkui Lin, Yongxiu Xu, Minghao Tang, Shilong Zhang, Hongbo Xu, Hao Xu, Yubin Wang · 2025

Multimodal relation extraction (MRE) is a crucial task in the fields of Knowledge Graph and Multimedia, playing a pivotal role in multimodal knowledge graph construction. However, existing methods are…

Hallucinating with AI: AI Psychosis as Distributed Delusions

Lucy Osler · 2025

There is much discussion of the false outputs that generative AI systems such as ChatGPT, Claude, Gemini, DeepSeek, and Grok create. In popular terminology, these have been dubbed AI hallucinations. H…

Spatial Audio Processing with Large Language Model on Wearable Devices

Ayushi Mishra, Yang Bai, Priyadarshan Narayanasamy, Nakul Garg, Nirupam Roy · 2025

Integrating spatial context into large language models (LLMs) has the potential to revolutionize human-computer interaction, particularly in wearable devices. In this work, we present a novel system a…

LIMCA: LLM for Automating Analog In-Memory Computing Architecture Design Exploration

Deepak Vungarala, Md Hasibul Amin, Pietro Mercati, Arnob Ghosh, Arman Roohi, Ramtin Zand, Shaahin Angizi · 2025

Resistive crossbars enabling analog In-Memory Computing (IMC) have emerged as a promising architecture for Deep Neural Network (DNN) acceleration, offering high memory bandwidth and in-situ computatio…

Some remarks on the results derived by Ramy Takieldin and Patrick Solé (2025)

Varsha Chauhan, Anuradha Sharma · 2025

The purpose of this note is to rectify a typographical error in the statements of Theorems 5.5 and 5.6 of Sharma, Chauhan and Singh [3] and further analyze and discuss the significance of the results d…

Computational Complexity and Integer Programming Formulation of the Oredango Puzzle

Takuma Takahata, Norito Minamikawa, Takayuki Okuno · 2025

Oredango, a pencil puzzle, was originally created by Kanaiboshi and published in the popular puzzle magazine Nikoli. In this paper, we show NP- and ASP-completeness of Oredango by con…

Weakly Supervised Multiple Instance Learning for Whale Call Detection and Temporal Localization in Long-Duration Passive Acoustic Monitoring

Ragib Amin Nihal, Benjamin Yen, Runwu Shi, Kazuhiro Nakadai · 2025

Marine ecosystem monitoring via Passive Acoustic Monitoring (PAM) generates vast data, but deep learning often requires precise annotations and short segments. We introduce DSMIL-LocNet, a Multiple In…

Characterizing the Interaction of Cultural Evolution Mechanisms in Experimental Social Networks

Raja Marjieh, Manuel Anglada-Tort, Thomas L. Griffiths, Nori Jacoby · 2025

Understanding how cognitive and social mechanisms shape the evolution of complex artifacts such as songs is central to cultural evolution research. Social network topology (what artifacts are availabl…

Evolomino is NP-complete

Andrei V. Nikolaev · 2025

Evolomino is a pencil-and-paper logic puzzle popularized by the Japanese publisher Nikoli (like Sudoku, Kakuro, Slitherlink, Masyu, and Fillomino). The puzzle's name reflects its core mechanic: the sh…

Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference

Shuqi Dai, Yunyun Wang, Roger B. Dannenberg, Zeyu Jin · 2025

We propose a unified framework for Singing Voice Synthesis (SVS) and Conversion (SVC), addressing the limitations of existing approaches in cross-domain SVS/SVC, poor output musicality, and scarcity o…

Interactive Information Need Prediction with Intent and Context

Kevin Ros, Dhyey Pandya, ChengXiang Zhai · 2025

The ability to predict a user's information need would have wide-ranging implications, from saving time and effort to mitigating vocabulary gaps. We study how to interactively predict a user's informa…

Pitch-and-Spectrum-Aware Singing Quality Assessment with Bias Correction and Model Fusion

Yu-Fei Shi, Yang Ai, Ye-Xin Lu, Hui-Peng Du, Zhen-Hua Ling · 2024

We participated in track 2 of the VoiceMOS Challenge 2024, which aimed to predict the mean opinion score (MOS) of singing samples. Our submission secured the first place among all participating teams,…
