226+ open-access research outputs.
The rapid growth of LLMs demands high-throughput, memory-capacity-intensive inference on resource-constrained edge devices, where single-batch decoding remains fundamentally memory-bound. Existing out…
Recommendation systems have gained wide popularity for a variety of personalized suggestion tasks, but the ever-increasing volume of user data makes real-time processing of recommendation systems dif…
We study the computational complexity of approximately computing the partition function of a spin system. Techniques based on standard counting-to-sampling reductions yield $\tilde{O}(n^2)$-time algor…
Semiconductor intellectual property (IP) theft incurs hundreds of billions in annual losses, driven by advanced reverse engineering (RE) techniques. Traditional ``cryptic'' IC camouflaging methods typ…
Virtual Reality (VR) emphasizes immersive experiences, while text entry often requires hands or visual attention, which may disrupt the interaction flows in VR. We present AnkleType, a hand- and eye-f…
Solid-state storage architectures based on NAND or emerging memory devices (SSD) are fundamentally architected and optimized for both reliability and performance. Achieving these simultaneous goals r…
Retrieval-Augmented Generation (RAG) relies on large-scale Approximate Nearest Neighbor Search (ANNS) to retrieve semantically relevant context for large language models. Among ANNS methods, IVF-PQ of…
In this work, we report the implementation and performance evaluation of memristor-driven fundamental logic gates, including NOT, AND, NAND, OR, NOR, and XOR, and a novel and optimized design of the sequent…
Semiconductor intellectual property (IP) theft incurs estimated annual losses ranging from $225 billion to $600 billion. Despite initiatives like the CHIPS Act, many semiconductor designs remain vulne…
Many operational cloud systems use one or more machine learning models that help them achieve better efficiency and performance. But operators do not have tools to help them understand how each model …
Deploying large language models (LLMs) on edge devices enables personalized agents with strong privacy and low cost. However, with tens to hundreds of billions of parameters, single-batch autoregressi…
Roughgarden (2020) initiates the study of Transaction Fee Mechanisms (TFMs), and posits that the on-chain game of a ``good'' TFM should be on-chain simple (OnCS), i.e., incentive compatible for users …
We present a prototype multi-input gate extension of the publicly available Involution Tool for accurate digital timing simulation and power analysis of integrated circuits introduced by Oehlinger et …
This paper presents an in-memory computing (IMC) architecture developed on an 8x8 array of 8T SRAM cells. This architecture enables both multi-bit parallel Multiply-Accumulate (MAC) operations and sta…
Hyperdimensional Computing (HDC) encodes information and data into high-dimensional distributed vectors that can be manipulated using simple bitwise operations and similarity searches, offering parall…
The advancement of large language models has led to models with billions of parameters, significantly increasing memory and compute demands. Serving such models on conventional hardware is challenging…
Although NAND flash memory has achieved continuous capacity improvements via advanced 3D stacking and multi-level cell technologies, these innovations introduce new reliability challenges, particularl…
In 1987, Jim Gray and Gianfranco Putzolu introduced the five-minute rule, a simple, storage-memory-economics-based heuristic for deciding when data should live in DRAM rather than on storage. Subseque…
Quad-level cell (QLC) 3D NAND flash memory is emerging as the predominant storage solution in the era of artificial intelligence. QLC 3D NAND flash stores 4 bits per cell to expand the storage densi…
Using nationally representative data from the 2020 and 2024 American National Election Studies (ANES), this paper traces how the U.S. social media landscape has shifted across platforms, demographics,…