292+ open-access research outputs.
FPGA-level emulation is a key step in pre-silicon chip design validation. However, emulating large-scale multi-core systems increasingly exceed the hardware resource capacity of a single FPGA, limitin…
The security of open-source software repositories is increasingly threatened by next-gen software supply chain attacks. These attacks include multiphase malware execution, remote access activation, an…
This paper introduces a search algorithm for index structures based on a B+ tree, specifically optimized for execution on a field-programmable gate array (FPGA). Our implementation efficiently travers…
As modern Coarse Grain Reconfigurable Arrays (CGRAs) grow in size, efficient utilization of the available fabric by a single application becomes increasingly difficult. Existing CGRA mappers either fa…
We present the first kernel-fused SAR Range Doppler pipeline on any GPU platform. By fusing FFT, matched-filter multiply, and IFFT into a single Metal compute dispatch -- keeping all intermediate data…
Authenticated private information retrieval (APIR) is the state-of-the-art error-detecting private information retrieval (ED-PIR), using Distributed Point Functions (DPFs) for subpolynomial complexity…
Modern large language models (LLMs) increasingly depends on efficient long-context processing and generation mechanisms, including sparse attention, retrieval-augmented generation (RAG), and compresse…
Multimodal stacks that mix ViTs, CNNs, GNNs, and transformer NLP strain embedded platforms because their compute/memory patterns diverge and hard real-time targets leave little slack. TRINE is a singl…
Dynamic GNN inference has exhibited effectiveness in High Energy Physics (HEP) experiments at High Luminosity Large Hadron Collider (HL-LHC) due to strong capability to model complex particle interact…
How many tokens can a GPU inference cluster deliver per watt? Across deployments of identical hardware, the answer varies by 40x -- not because of software inefficiency, but because of the serving con…
The rapid proliferation of the Internet of Things (IoT) and smart applications has led to a surge in data generated by distributed sensing devices. Edge computing is a mainstream approach to managing …
Gated DeltaNet (GDN) is a linear attention mechanism that replaces the growing KV cache with a fixed-size recurrent state. Hybrid LLMs like Qwen3-Next use 75% GDN layers and achieve competitive accura…
In long-context large language model (LLM) inference, the prefill stage dominates computation due to self-attention over the complete input context. Sparse attention significantly reduces self-attenti…
Cell-free massive multi-input multi-output (MIMO) promises uniform high performance across the network, but also brings a high energy cost due to joint transmission from distributed radio units (RUs) …
Hyperdimensional Computing (HDC) represents data using extremely high-dimensional, low-precision vectors, termed hypervectors (HVs), and performs learning and inference through lightweight, noise-tole…
Now-a-days, Online Social Networks (OSNs) are extensively used by different commercial houses for viral marketing. The key problem that arises in this context is to choose a limited number of highly i…
We study the generalization error of stochastic learning algorithms from an information-theoretic perspective, with a particular emphasis on deriving sharper bounds for differentially private algorith…
RWKV is a modern RNN architecture that approaches the performance of Transformers, with the advantage of processing long contexts at a linear memory cost. However, its sequential computation pattern s…
Understanding the risks associated with an enterprise environment is the first step toward improving its security. Organizations employ various methods to assess and prioritize the risks identified in…
Homomorphic Encryption (HE) enables secure computation on encrypted data, addressing privacy concerns in cloud computing. However, the high computational cost of HE operations, particularly matrix mul…
Free open-access publishing with Google Scholar indexing.
Submission Guide →