Memory in Computer Science — Research Repository

Computer Science Preprint PDF DOI

DPC: A Distributed Page Cache over CXL

Shai Bergman, Zhe Yang, Julien Eudine, Giorgio Negro, Onur Mutlu, Arash Tavakkol, Ji Zhang · 2026

Modern distributed file systems rely on uncoordinated, per-node page caches that replicate hot data locally across the cluster. While ensuring fast local access, this architecture underutilizes aggreg…

CROWDio: A Practical Mobile Crowd Computing Framework with Developer-Oriented Design, Adaptive Scheduling, and Fault Resilience

Lakshani Manamperi, Disumi Pathirana, Thiwanka Pathirana, Nipun Premarathna, Kutila Gunasekara · 2026

Mobile Crowd Computing (MCdC) leverages the idle computational capacity of consumer smartphones to enable distributed task processing at scale; however, widespread real-world adoption remains constrai…

POLAR-PIC: A Holistic Framework for Matrixized PIC with Co-Designed Compute, Layout, and Communication

Yizhuo Rao, Xingjian Cui, Shangzhi Pang, Jiabin Xie, Guangnan Feng, Jinhui Wei, Ziyan Zhang, Languang Gao, Zhenyu Wang, Zhiguang Chen, Yutong Lu · 2026

Particle-in-Cell (PIC) simulations are fundamental to plasma physics but often suffer from limited scalability due to particle-grid interaction bottlenecks and particle redistribution costs. Specifica…

Energy Efficient LSTM Accelerators for Embedded FPGAs through Parameterised Architecture Design

Chao Qian, Tianheng Ling, Gregor Schiele · 2026

Long Short-term Memory Networks (LSTMs) are a vital Deep Learning technique suitable for performing on-device time series analysis on local sensor data streams of embedded devices. In this paper, we p…

A Simple Communication Scheme for Distributed Fast Multipole Methods

Srinath Kailasa · 2026

We present a simple hierarchical communication scheme for distributed Fast Multipole Methods (FMMs) based on MPI neighborhood collectives and uniform trees. The method targets the common case of exten…

Design Rules for Extreme-Edge Scientific Computing on AI Engines

Zhenghua Ma, G Abarajithan, Dimitrios Danopoulos, Olivia Weng, Francesco Restuccia, Ryan Kastner · 2026

Extreme-edge scientific applications use machine learning models to analyze sensor data and make real-time decisions. Their stringent latency and throughput requirements demand small batch sizes and r…

Heuristic Search Space Partitioning for Low-Latency Multi-Tenant Cloud Queries

Prashant Kumar Pathak, Chandra Biksheswaran Mouleeswaran, Rama Teja Repaka · 2026

Large-scale cloud security platforms must continuously query millions of structured cloud resource records distributed across thousands of tenant accounts. Broad, account-spanning queries saturate dat…

CHRONOS: A Hardware-Assisted Phase-Decoupled Framework for Secure Federated Learning in IoT

Hung Dang · 2026

We propose CHRONOS, a hardware-assisted framework that decouples the cryptographic setup required for private gradient aggregation from the active training phase. CHRONOS executes a once-per-epoch ser…

Ocean: Fast Estimation-Based Sparse General Matrix-Matrix Multiplication on GPU

Yifan Li, Giulia Guidi · 2026

In computational science and data analytics, many workloads involve irregular and sparse computations that are inherently difficult to optimize for modern hardware. A key kernel is Sparse General Matr…

A Comparative Analysis of ARM and x86-64 Laptop-Class Processors: Architecture, Assembly-Level Performance, and Energy Efficiency

Mustafa Mert Özyılmaz · 2026

ARM-based and x86-64 laptop processors differ not only in instruction-set design, but also in memory hierarchy, core organization, system integration, and power-management mechanisms. This study prese…

High-Fidelity 3D Gaussian Human Reconstruction via Region-Aware Initialization and Geometric Priors

Yang Liu, Zhiyong Zhang · 2026

Real-time, high-fidelity 3D human reconstruction from RGB images is essential for interactive applications such as virtual reality and gaming, yet remains challenging due to the complex non-rigid defo…

Optimizing Branch Predictor for Graph Applications

Upasna, Venkata Kalyan Tavva · 2026

Real-world graph applications are generally larger than the cache itself. For this reason, earlier work identified the memory hierarchy as a key bottleneck. Undoubtedly, the…

Beyond Indistinguishability: Measuring Extraction Risk in LLM APIs

Ruixuan Liu, David Evans, Li Xiong · 2026

Indistinguishability properties such as differential privacy bounds or low empirically measured membership inference are widely treated as proxies to show a model is sufficiently protected against bro…

HybridGen: Efficient LLM Generative Inference via CPU-GPU Hybrid Computing

Mao Lin, Xi Wang, Guilherme Cox, Dong Li, Hyeran Jeon · 2026

As modern LLMs support thousands to millions of tokens, KV caches grow to hundreds of gigabytes, stressing memory capacity and bandwidth. Existing solutions, such as KV cache pruning and offloading, a…

Aligning Language Models for Lyric-to-Melody Generation with Rule-Based Musical Constraints

Hao Meng, Siyuan Zheng, Shuran Zhou, Qiangqiang Wang, Yang Song · 2026

Large Language Models (LLMs) show promise in lyric-to-melody generation, but models trained with Supervised Fine-Tuning (SFT) often produce musically implausible melodies with issues like poor rhythm …

Balanced Co-Clustering of Users and Items for Embedding Table Compression in Recommender Systems

Runhao Jiang, Renchi Yang, Donghao Wu · 2026

Recommender systems have advanced markedly over the past decade by transforming each user/item into a dense embedding vector with deep learning models. At industrial scale, embedding tables constitute…

AQPIM: Breaking the PIM Capacity Wall for LLMs with In-Memory Activation Quantization

Kosuke Matsushima, Yasuyuki Okoshi, Masato Motomura, Daichi Fujiki · 2026

Processing-in-Memory (PIM) architectures offer a promising solution to the memory bottlenecks in data-intensive machine learning, yet often overlook the growing challenge of activation memory footprin…

Proxics: an efficient programming model for far memory accelerators

Zikai Liu, Niels Pressel, Jasmin Schult, Roman Meier, Pengcheng Xu, Timothy Roscoe · 2026

The use of disaggregated or far memory systems such as CXL memory pools has renewed interest in Near-Data Processing (NDP): situating cores close to memory to reduce bandwidth requirements to and from…

Optimizing Memory Allocation in Distributed Clusters with Predictive Modeling

Jonathan Bader, Edgar Blumenthal, Marten Eckardt, Justus Krebs, Joel Witzke, Xemena Wysokinska, Haci Ismail Aslan, Odej Kao · 2026

In modern distributed systems, efficient resource allocation is a vital aspect to maintain scalability, reduce operational costs, and ensure fast execution even across heterogeneous workloads. Predict…

Architecture Matters More Than Scale: A Comparative Study of Retrieval and Memory Augmentation for Financial QA Under SME Compute Constraints

Jianan Liu, Jing Yang, Xianyou Li, Weiran Yan, Yichao Wu, Penghao Liang, Mengwei Yuan · 2026

The rapid adoption of artificial intelligence (AI) and large language models (LLMs) is transforming financial analytics by enabling natural language interfaces for reporting, decision support, and aut…
