328+ open-access research outputs.
Modern retrieval-augmented generation (RAG) systems treat vector embeddings as static, context-free artifacts: an embedding has no notion of when it was created, how trustworthy its source is, or whic…
Large Language Models (LLMs) have recently shown strong potential for automated unit test generation. This has motivated us to investigate whether developer-defined test doubles (commonly referred to …
We present SQL Query Engine, an open-source, self-hosted service that translates natural language questions into validated PostgreSQL queries through a two-stage LLM pipeline. The first stage performs…
In a seminal work, Dooly, Goldman, and Scott (STOC 1998; JACM 2001) introduced the classic Online TCP Acknowledgment problem. In this problem, a sequence of $n$ packets arrives over time, and the obje…
Large language models frequently fail to produce correct code on their first attempt, yet most benchmarks evaluate them in a single-shot setting. We investigate iterative self-repair (feeding executio…
We develop a domain-theoretic framework for imprecise probability reasoning and inference on general topological spaces with a countably based continuous lattice of open sets. We address two distinct …
Bilateral bargaining under incomplete information provides a controlled testbed for evaluating large language model (LLM) agent capabilities. Bilateral trade demands individual rationality, strategic …
Artificial Intelligence (AI)-assisted coding environments operate within finite context windows of 128,000-1,000,000 tokens (as of early 2026), yet existing tools offer limited support for monitoring …
Spot instances offer significant cost savings of up to 90% over on-demand prices, making them an attractive resource for large-scale computing workloads. However, understanding their availability dyna…
This paper proposes a new method for constructing multidimensional signal constellations (SC), referred to as SCOPT, for high-speed communication systems with enhanced energy efficiency (EE). In contr…
Flaky failure triage is crucial for keeping distributed database continuous integration (CI) efficient and reliable. After a failure is observed, operators must quickly decide whether to auto-rerun th…
While the size of a data breach is typically measured by the number of (consumer, customer, or user) records exposed or compromised, its economic impact is generally measured from the point of view of…
Most existing text-to-speech (TTS) systems either synthesize speech sentence by sentence and stitch the results together, or drive synthesis from plain-text dialogues alone. Both approaches leave mode…
LLM agents are increasingly relevant to research domains such as vulnerability discovery. Yet, the strongest systems remain closed and cloud-only, making them resource-intensive, difficult to reproduc…
How many tokens can a GPU inference cluster deliver per watt? Across deployments of identical hardware, the answer varies by 40x -- not because of software inefficiency, but because of the serving con…
We introduce the Huffman-Bucket Sketch (HBS), a simple, mergeable data structure that losslessly compresses a HyperLogLog (HLL) sketch with $m$ registers to optimal space $O(m+\log n)$ bits, with amor…
Generative Recommender Systems (GR) increasingly model user behavior as a sequence generation task by interleaving item and action tokens. While effective, this formulation introduces significant stru…
Clinical documentation and data retrieval within Electronic Health Records (EHRs) contribute substantially to clinician workload and burnout. To address this, we developed Scout, an LLM-based EHR sear…
Bounded self-certification in Turing machines fails because self-simulation necessarily incurs a strictly positive temporal overhead. We translate this operational constraint into a domain-theoretic f…
We develop locale theory constructively and predicatively in univalent foundations (UF), with a particular focus on the theory of spectral and Stone locales. In the context of UF, predicativity refers…