101+ open-access research outputs.
The recently introduced model of representations has been defined and motivated somewhat ex-nihilo. In this document, I will show that representations are related to a more ''classical'' model through…
Non-verbal vocalizations (NVVs) like laugh, sigh, and sob are essential for human-like speech, yet standardized evaluation remains limited in jointly assessing whether systems can generate the intende…
This work introduces an ultralow-power voltage-to-spike encoder that achieves near-linear voltage-to-firing-rate conversion by pairing a linearized bulk-driven transconductor with a DPI-based LIF neur…
Smooth words over an alphabet of non-negative integers $\{a,b\}$ are infinite words that are infinitely derivable, the most famous example being the Oldenburger-Kolakoski word over $\{1,2\}$. The main…
Large language model (LLM)-driven automated program repair (APR) has advanced rapidly, but most methods remain code-centric: they directly rewrite source code and thereby risk hallucinated, behavioral…
Automated singing assessment is crucial for education and entertainment. However, existing systems face two fundamental limitations: reliance on reference tracks, which stifles creative expression, an…
We introduce a broadcast model called the singing model, where agents are oblivious of the size and structure of the communication network, even their immediate neighborhood. Agents can sing multiple …
Multi-channel audio alignment is a key requirement in bioacoustic monitoring, spatial audio systems, and acoustic localization. However, existing methods often struggle to address nonlinear clock drif…
Multimodal relation extraction (MRE) is a crucial task in the fields of Knowledge Graph and Multimedia, playing a pivotal role in multimodal knowledge graph construction. However, existing methods are…
There is much discussion of the false outputs that generative AI systems such as ChatGPT, Claude, Gemini, DeepSeek, and Grok create. In popular terminology, these have been dubbed AI hallucinations. H…
Integrating spatial context into large language models (LLMs) has the potential to revolutionize human-computer interaction, particularly in wearable devices. In this work, we present a novel system a…
Resistive crossbars enabling analog In-Memory Computing (IMC) have emerged as a promising architecture for Deep Neural Network (DNN) acceleration, offering high memory bandwidth and in-situ computatio…
The purpose of this note is to rectify a typographical error in the statements of Theorems 5.5 and 5.6 of Sharma, Chauhan and Singh[3] and further analyze and discuss the significance of the results d…
Oredango puzzle, one of the pencil puzzles, was originally created by Kanaiboshi and published in the popular puzzle magazine Nikoli. In this paper, we show NP- and ASP-completeness of Oredango by con…
Marine ecosystem monitoring via Passive Acoustic Monitoring (PAM) generates vast data, but deep learning often requires precise annotations and short segments. We introduce DSMIL-LocNet, a Multiple In…
Understanding how cognitive and social mechanisms shape the evolution of complex artifacts such as songs is central to cultural evolution research. Social network topology (what artifacts are availabl…
Evolomino is a pencil-and-paper logic puzzle popularized by the Japanese publisher Nikoli (like Sudoku, Kakuro, Slitherlink, Masyu, and Fillomino). The puzzle's name reflects its core mechanic: the sh…
We propose a unified framework for Singing Voice Synthesis (SVS) and Conversion (SVC), addressing the limitations of existing approaches in cross-domain SVS/SVC, poor output musicality, and scarcity o…
The ability to predict a user's information need would have wide-ranging implications, from saving time and effort to mitigating vocabulary gaps. We study how to interactively predict a user's informa…
We participated in track 2 of the VoiceMOS Challenge 2024, which aimed to predict the mean opinion score (MOS) of singing samples. Our submission secured the first place among all participating teams,…
Free open-access publishing with Google Scholar indexing.
Submission Guide →