11,820+ open-access research outputs.
Ad-hoc queries over frequently updated data in a flat schema are common in real-time data analysis applications and often require very low latency. Online aggregation can achieve this by providing appro…
This paper studies a key research question: how to achieve perfect privacy in over-the-air computation (AirComp)? The problem is particularly intriguing due to a dilemma. Real-field operations can ens…
Spiking neural networks (SNNs) are a promising paradigm for energy-efficient event-driven computation, but large-scale SNN execution remains challenging because sparse spike communication and synchron…
Communication has emerged as a critical bottleneck in the distributed training of large language models (LLMs). While numerous approaches have been proposed to reduce communication overhead, the poten…
Access to genomic data is highly regulated due to its sensitive nature. While safeguards are essential, cumbersome data access processes pose a significant barrier to the development of AI methods for…
Pretrial risk assessment tools are used on over one million U.S. defendants each year, yet their use for predicting rare violent re-offense faces a basic statistical barrier. We derive a universal pre…
Radio Access Network (RAN) configuration has traditionally required significant manual effort due to indirect causal dependencies between observable Key Performance Indicators (KPIs), and context-depe…
Category-based coordination mechanisms allocate resources by mapping a declared service category to a fixed resource profile, without observing individual demand types. We establish three results for …
We show that if the conditional distribution p(C | T) factors through a sufficient statistic φ(T), then the Information Bottleneck (IB) problem for (T, C) is exactly equivalent to the IB problem …
Training large language models requires jointly configuring two interdependent aspects of the system: the global batch size, which governs statistical efficiency, and the 3D parallelism strategy, whic…
Python's dynamic nature complicates testing and increases the possibility that some defects evade detection, so effective fault prediction becomes essential. We examine whether post-release faults …
The Gram matrix is a classical object formed from the pairwise inner products of a collection of vectors, with fundamental roles in functional analysis, statistics, combinatorics, and coding theory. I…
Digital biomarkers for depression have largely relied on static acoustic descriptors, pooled summary statistics, or conventional machine learning representations. Such approaches may miss nonlinear te…
We study detection of collapse in high-dimensional point clouds, where mass concentrates near a lower-dimensional set relative to a non-collapsed geometry. We propose persistent homology-based test st…
Social identity is a concept from psychology that refers to the part of an individual's identity that derives from their group membership(s). In this paper, we explore social identity in members of th…
We propose a human-in-the-loop approach for black-box testing of Functional Mock-up Units (FMUs) using Large Language Models (LLMs). The goal is to reduce the manual effort in defining test scenarios …
We report a striking statistical regularity in frontier LLM outputs that enables a CPU-only scoring primitive running at 2.6 microseconds per token, with estimated latency up to 100,000× (five …
Generating realistic synthetic citation, patent, or component dependency networks is essential for benchmarking community detection, graph visualisation, and network data mining algorithms. We present…
In benchmarking of Information Retrieval systems, the Wilcoxon signed-rank test is often treated as a safer alternative to the t-test. This belief is fueled by textbooks and recommendations that portr…
Wearable Human Activity Recognition (HAR) still lacks a representation that is both explicit and adaptable. Handcrafted time-series features (TSFs) capture meaningful motion statistics and remain comp…