918+ open-access research outputs.
Distributed GPU applications increasingly rely on kernel-level, cross-node coordination to reduce launch overheads and improve compute-communication overlap, but such support is lacking. On OFI-based …
With network requirements diverging across emerging applications, latency-critical services demand minimal logic delay, while hyperscale training and collectives require sustained line-rate throughput…
Approximate nearest neighbor (ANN) search in AI systems increasingly handles sensitive data on third-party infrastructure. Trusted execution environments (TEEs) offer protection, but cost-efficient de…
Blocklisting is a common technique for preventing the use of known malicious content. However, conventional blocklisting infrastructures require either the blocklist to be public or clients to reveal …
Real-world video creation often involves a complex reasoning workflow of selecting relevant shots from noisy materials, planning missing shots for narrative completeness, and organizing them into cohe…
Large language models (LLMs) are increasingly deployed as the execution core of autonomous agents rather than as standalone text generators. Agentic workloads induce a temporal shift from single-turn …
Vision-Language-Action (VLA) models have emerged as the mainstream of embodied intelligence. Recent VLA models have expanded their input modalities from 2D-only to 2D+3D paradigms, forming multi-visua…
Memory Dependence Prediction (MDP) is a speculative technique to determine which stores, if any, a given load will depend on. Area-constrained cores are increasingly relevant in various applications s…
Personal AI systems increasingly retain long-term memory of user activity, including documents, emails, messages, meetings, and ambient recordings. Trusted hardware can keep this data private, but str…
In this paper, I evaluate the risks of an AI criminal mastermind, an AI agent capable of planning, coordinating, and committing a crime through the onboarding of human collaborators ('taskers'). In he…
AI agents -- systems that can independently take actions to pursue complex goals with only limited human oversight -- have entered the mainstream. These systems are now being widely used to produce so…
Quorum design over asymmetric topologies conflates two independent concerns: inter-tier obligation (which tiers must participate for cross-tier safety) and intra-tier replication (how each tier surviv…
Large language model (LLM)-based AI agents are increasingly capable of complex clinical reasoning and may soon participate in medical decision-making with limited or no real-time human oversight. This…
Interplanetary networks (IPNs) present unique challenges such as extreme delay, high loss, and frequent disruptions that severely degrade the performance of conventional transport protocols like Trans…
Gated DeltaNet (GDN) is a linear attention mechanism that replaces the growing KV cache with a fixed-size recurrent state. Hybrid LLMs like Qwen3-Next use 75% GDN layers and achieve competitive accura…
Aggregate Programming (AP) is a paradigm for programming the collective behaviour of sets of distributed devices, possibly situated at the network far edge, by relying on asynchronous proximity-based …
We introduce Kruskal-EDS (Edge Dynamic Stratification), a distribution-adaptive variant of Kruskal's minimum spanning tree (MST) algorithm that replaces the mandatory $\Theta(m\log m)$…
Timely availability of high-fidelity entanglement is essential for emerging quantum networks. This paper introduces the Age of Entanglement (AoE) as a novel performance metric that captures the freshn…
Very soon, millions of AI agents will proliferate across the economy, autonomously taking billions of actions. Inevitably, things will go wrong. Humans will be defrauded, injured, even killed. Law wil…
Probabilistic bits (p-bits) offer an energy-efficient hardware abstraction for stochastic optimization; however, existing p-bit-based simulated annealing accelerators suffer from poor scalability and …