11,391+ open-access research outputs.
We show that Fréchet Distance (FD), long considered impractical as a training objective, can in fact be effectively optimized in the representation space. Our idea is simple: decouple the population…
In modern parametric model training, full-batch gradient descent (and its variants) suffers due to progressively stronger biasing towards the exact realization of training data; this drives the system…
We investigate several aspects of the Bialynicki-Birula decomposition of a smooth complete $\mathbb{G}_m$-variety with finite fixed locus. Our results include novel characterizations of when the Bialy…
Bayesian online learning provides a coherent framework for sequential inference. However, its theoretical understanding remains limited, particularly in the one-pass setting. Existing theoretical guar…
EigenDecomposition (ED) is at the heart of many computer vision algorithms and applications. One crucial bottleneck limiting its usage is the expensive computation cost, particularly for a mini-batch …
Recent advances in Diffusion Transformer (DiT)-based video generation technologies have shown impressive results for video object removal. However, these methods still suffer from substantial inferenc…
Drifting models are capable one-step generative models trained to follow a drifting field. The field combines attractive and repulsive softmax-weighted centroids over the data and current-generator di…
Reliability-based topology optimization (RBTO) requires repeated estimation of small failure probabilities and their gradients, making conventional nested Monte Carlo approaches computationally prohib…
This paper develops a deep policy iteration method for high-dimensional finite-horizon mean-field games. We reformulate the game as a regenerative problem with deterministic cycles, which allows polic…
Training large language models requires jointly configuring two interdependent aspects of the system: the global batch size, which governs statistical efficiency, and the 3D parallelism strategy, whic…
A convex polyhedron is Rupert if a hole can be cut into it (making its genus $1$) such that an identical copy of the polyhedron can pass through the hole. Resolving a conjecture of Jerrard-Wetzel-Yuan…
Dynamic quantization has emerged as a practical approach to increase the utilization and efficiency of the machine learning serving flow. Unlike static quantization, which applies quantization offline, dy…
Differentially private (DP) contrastive learning aims to learn general-purpose representations from sensitive data, alleviating the privacy leakage concerns of organizations deploying or sharing embed…
Supervised contrastive learning (SupCon) is widely used to shape representations, but has seen limited targeted study for audio deepfake detection. Existing work typically combines contrastive terms w…
The optimal kernel configuration for Mixture-of-Experts (MoE) inference depends on both batch size and the expert routing distribution, yet production systems dispatch from batch size alone, leaving 1…
We prove a structural result for sets of integers with doubling at most $4 + \delta$, with $\delta>0$ sufficiently small. This generalises earlier work of Eberhard–Green–Manners which dealt with set…
Parallel and Distributed Computing (PDC) is a critical yet conceptually challenging area of the undergraduate computer science curriculum. While students often encounter these concepts in theory, few …
The Neganov-Trofimov-Luke (NTL) effect is used by experiments based on cryogenic detectors to boost the sensitivity of light-sensitive devices down to a few optical photons. In this work we introduce …
The rapid growth of LLMs demands high-throughput, memory-capacity-intensive inference on resource-constrained edge devices, where single-batch decoding remains fundamentally memory-bound. Existing out…
Prior work on node classification has shown that Graph Neural Networks (GNNs) can learn representations that transfer across graphs, when underlying graph properties are shared. For a fixed graph, one…