Douglas Scott in Computer Science — Research Repository

Computer Science Preprint PDF DOI

Self-Aware Vector Embeddings for Retrieval-Augmented Generation: A Neuroscience-Inspired Framework for Temporal, Confidence-Weighted, and Relational Knowledge

Naizhong Xu · 2026

Modern retrieval-augmented generation (RAG) systems treat vector embeddings as static, context-free artifacts: an embedding has no notion of when it was created, how trustworthy its source is, or whic…


Improving LLM-Driven Test Generation by Learning from Mocking Information

Jamie Lee, Flynn Teh, Hengcheng Zhu, Mengzhen Li, Mattia Fazzini, Valerio Terragni · 2026

Large Language Models (LLMs) have recently shown strong potential for automated unit test generation. This has motivated us to investigate whether developer-defined test doubles (commonly referred to …


SQL Query Engine: A Self-Healing LLM Pipeline for Natural Language to PostgreSQL Translation

Muhammad Adeel Ijaz · 2026

We present SQL Query Engine, an open-source, self-hosted service that translates natural language questions into validated PostgreSQL queries through a two-stage LLM pipeline. The first stage performs…


Online TCP Acknowledgment under General Delays

Sujoy Bhore, Michał Pawłowski, Seeun William Umboh · 2026

In a seminal work, Dooly, Goldman, and Scott (STOC 1998; JACM 2001) introduced the classic Online TCP Acknowledgment problem. In this problem, a sequence of $n$ packets arrives over time, and the obje…


How Many Tries Does It Take? Iterative Self-Repair in LLM Code Generation Across Model Scales and Benchmarks

Johin Johny Arimbur · 2026

Large language models frequently fail to produce correct code on their first attempt, yet most benchmarks evaluate them in a single-shot setting. We investigate iterative self-repair (feeding executio…


A Domain-Theoretic Foundation for Imprecise Probability and Credal Sets

Abbas Edalat, Pietro Di Gianantonio, Amin Farjudian · 2026

We develop a domain-theoretic framework for imprecise probability reasoning and inference on general topological spaces with a countably based continuous lattice of open sets. We address two distinct …


Training Language Models for Bilateral Trade with Private Information

Dirk Bergemann, Soheil Ghili, Xinyang Hu, Chuanhao Li, Zhuoran Yang · 2026

Bilateral bargaining under incomplete information provides a controlled testbed for evaluating large language model (LLM) agent capabilities. Bilateral trade demands individual rationality, strategic …


Tokalator: A Context Engineering Toolkit for Artificial Intelligence Coding Assistants

Vahid Farajijobehdar, Ilknur Koseoglu Sarı, Nazım Kemal Ure, Engin Zeydan · 2026

Artificial Intelligence (AI)-assisted coding environments operate within finite context windows of 128,000-1,000,000 tokens (as of early 2026), yet existing tools offer limited support for monitoring …


Spot-and-Scoot: Peeking Into Spot Instance Availability

Kyumin Kim, Moohyun Song, Taeyoon Kim, Kyungyong Lee · 2026

Spot instances offer significant cost savings of up to 90% over on-demand prices, making them an attractive resource for large-scale computing workloads. However, understanding their availability dyna…


Signal Constellations with Enhanced Energy Efficiency for High-Speed Communication Systems

Mark Bykhovskiy · 2026

This paper proposes a new method for constructing multidimensional signal constellations (SC), referred to as SCOPT, for high-speed communication systems with enhanced energy efficiency (EE). In contr…


A Practical Framework for Flaky Failure Triage in Distributed Database Continuous Integration

Jun-Peng Zhu, Qizhi Wang, Yulong Zhai, Yishen Sun, Sen Chen, Kai Xu, Peng Cai, Hongming Zhang, Heng Long, Liu Tang, Qi Liu · 2026

Flaky failure triage is crucial for keeping distributed database continuous integration (CI) efficient and reliable. After a failure is observed, operators must quickly decide whether to auto-rerun th…


Estimating the Social Cost of Corporate Data Breaches

Lina Alkarmi, Armin Sarabi, Mingyan Liu · 2026

While the size of a data breach is typically measured by the number of (consumer, customer, or user) records exposed or compromised, its economic impact is generally measured from the point of view of…


Borderless Long Speech Synthesis

Xingchen Song, Di Wu, Dinghao Zhou, Pengyu Cheng, Hongwu Ding, Yunchao He, Jie Wang, Shengfan Shen, Sixiang Lv, Lichun Fan, Hang Su, Yifeng Wang, Shuai Wang, Meng Meng, Jian Luan · 2026

Most existing text-to-speech (TTS) systems either synthesize speech sentence by sentence and stitch the results together, or drive synthesis from plain-text dialogues alone. Both approaches leave mode…


Post-Training Local LLM Agents for Linux Privilege Escalation with Verifiable Rewards

Philipp Normann, Andreas Happe, Jurgen Cito, Daniel Arp · 2026

LLM agents are increasingly relevant to research domains such as vulnerability discovery. Yet, the strongest systems remain closed and cloud-only, making them resource-intensive, difficult to reproduc…


The 1/W Law: An Analytical Study of Context-Length Routing Topology and GPU Generation Gains for LLM Inference Energy Efficiency

Huamin Chen, Xunzhuo Liu, Yuhan Liu, Junchen Jiang, Bowei He, Xue Liu · 2026

How many tokens can a GPU inference cluster deliver per watt? Across deployments of identical hardware, the answer varies by 40x -- not because of software inefficiency, but because of the serving con…


Huffman-Bucket Sketch: A Simple $O(m)$ Algorithm for Cardinality Estimation

Matti Karppa · 2026

We introduce the Huffman-Bucket Sketch (HBS), a simple, mergeable data structure that losslessly compresses a HyperLogLog (HLL) sketch with $m$ registers to optimal space $O(m+\log n)$ bits, with amor…


Beyond Interleaving: Causal Attention Reformulations for Generative Recommender Systems

Hailing Cheng · 2026

Generative Recommender Systems (GR) increasingly model user behavior as a sequence generation task by interleaving item and action tokens. While effective, this formulation introduces significant stru…


A Randomized Controlled Trial and Pilot of Scout: an LLM-Based EHR Search and Synthesis Platform

Michael Gao, Suresh Balu, William Knechtle, Kartik Pejavara, William Jeck, Matthew Ellis, Jason Thieling, Blake Cameron, Jason Tatreau, Tareq Aljurf, Henry Foote, Michael Revoir, Marshall Nichols, Matthew Gardner, William Ratliff, Bradley Hintze, Angelo Milazzo, Sreekanth Vemulapalli · 2026

Clinical documentation and data retrieval within Electronic Health Records (EHRs) contribute substantially to clinician workload and burnout. To address this, we developed Scout, an LLM-based EHR sear…


Diagonalizing Through the $\omega$-Chain: Iterated Self-Certification on Bounded Turing Machines and its Least Fixed Point

Miara Sung · 2026

Bounded self-certification in Turing machines fails because self-simulation necessarily incurs a strictly positive temporal overhead. We translate this operational constraint into a domain-theoretic f…


Constructive and Predicative Locale Theory in Univalent Foundations

Ayberk Tosun · 2026

We develop locale theory constructively and predicatively in univalent foundations (UF), with a particular focus on the theory of spectral and Stone locales. In the context of UF, predicativity refers…
