Sarah Scheffler — Research Repository

AI & Data Science · Preprint

RHyVE: Competence-Aware Verification and Phase-Aware Deployment for LLM-Generated Reward Hypotheses

Feiyu Wu, Xu Zheng, Zhuocheng Wang, Yi ming Dai, Hui Li · 2026

Large language models (LLMs) make reward design in reinforcement learning substantially more scalable, but generated rewards are not automatically reliable training objectives. Existing work has focus…

AI & Data Science · Preprint

Hierarchical adaptive control for real-time dynamic inference at the edge

Francesco Daghero, Mahyar Tourchi Moghaddam, Mikkel Baun Kjærgaard · 2026

Industrial systems increasingly depend on Machine Learning (ML), and operate on heterogeneous nodes that must satisfy tight latency, energy, and memory constraints. Dynamic ML models, which reconfigur…

Computer Science · Preprint

RAG-Enhanced Kernel-Based Heuristic Synthesis (RKHS): A Structured Methodology Using Large Language Models for Hardware Design

Shiva Ahir, Alex Doboli · 2026

Heuristic design upholds modern electronic design automation (EDA) tools, yet crafting effective placement, routing, and scheduling strategies entails substantial expertise. We study how large languag…

Computer Science · Preprint

NeuralEmu: in situ Measurement-Driven, ML-based, High-Fidelity 5G Network Emulation

Haoran Wan, Yaxiong Xie, Kyle Jamieson · 2026

Current and future applications demand ultra-low latency and consistent throughput, yet frequently traverse 5G cellular networks, so cope with volatile packet dynamics, as 5G base station schedulers d…

Computer Science · Preprint

NVLLM: A 3D NAND-Centric Architecture Enabling Edge on-Device LLM Inference

Mingbo Hao, Changwei Yan, Haoyu Cui, Zhihao Yan, Yizhi Ding, Zhangrui Qian, Weiwei Shan · 2026

The rapid growth of LLMs demands high-throughput, memory-capacity-intensive inference on resource-constrained edge devices, where single-batch decoding remains fundamentally memory-bound. Existing out…

Physics · Preprint

No Tile Left Behind: Multiprogramming for Surface-Code Architectures

Archisman Ghosh, Avimita Chatterjee, Swaroop Ghosh · 2026

Fault-tolerant quantum computing (FTQC) is emerging as the architectural regime in which practical large-scale quantum workloads will execute. In this setting, however, multiprogramming is no longer a…

Computer Science · Preprint

CacheFlow: Efficient LLM Serving with 3D-Parallel KV Cache Restoration

Sean Nian, Jiahao Fang, Qilong Feng, Zhiyu Wu, Fan Lai · 2026

KV cache restoration has emerged as a dominant bottleneck in serving long-context LLM workloads, including multi-turn conversations, retrieval-augmented generation, and agentic pipelines. Existing app…

AI & Data Science · Preprint

Scalable Hyperparameter-Divergent Ensemble Training with Automatic Learning Rate Exploration for Large Models

Hailing Cheng, Tao Huang, Chen Zhu, Antonio Alonso · 2026

Training large neural networks with data-parallel stochastic gradient descent allocates N GPU replicas to compute effectively identical updates -- a practice that leaves the rich space of learning rat…

AI & Data Science · Preprint

Enhanced Privacy and Communication Efficiency in Non-IID Federated Learning with Adaptive Quantization and Differential Privacy

Emre Ardıç, Yakup Genc · 2026

Federated learning (FL) is a distributed machine learning method where multiple devices collaboratively train a model under the management of a central server without sharing underlying data. One of t…

AI & Data Science · Preprint

Region Matters: Efficient and Reliable Region-Aware Visual Place Recognition

Shunpeng Chen, Yukun Song, Changwei Wang, Rongtao Xu, Kexue Fu, Longxiang Gao, Li Guo, Ruisheng Wang, Shibiao Xu · 2026

Visual Place Recognition (VPR) determines a query image's geographic location by matching it against geotagged databases. However, existing methods struggle with perceptual aliasing caused by irreleva…

Physics · Preprint

Understanding HWO's Field of Regard and Characterization Requirement Trade Space with a Dynamic Observation Scheduling Algorithm

Corey Spohn, Christopher C. Stark, Dmitry Savransky, Natasha Latouf · 2026

The Habitable Worlds Observatory (HWO) aims to image and characterize at least 25 ExoEarth candidates (EECs). Achieving this goal requires a detailed understanding of the observatory's design trade sp…

Computer Science · Preprint

FEPLB: Exploiting Copy Engines for Nearly Free MoE Load Balancing in Distributed Training

Shuyao Qi, Haoyuan Liu, Shizhen Zhao · 2026

Fine-grained, per-micro-batch load balancing is essential for efficient Mixture-of-Experts (MoE) training, yet every prior dynamic scheduling scheme pays for it with extra communication that is hard t…

Engineering · Preprint

QoS-Constrained Scheduling in Multi-Cell Multi-User MIMO Networks

Tenghao Cai, Lei Li, Tsung-Hui Chang · 2026

In 5G and beyond networks, efficient scheduling is essential to exploit the gains of multi-user MIMO (MU-MIMO) equipped with carrier aggregation and joint transmission (JT). However, cross-cell and cr…

AI & Data Science · Preprint

Multi-Domain Learning with Global Expert Mapping

Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Oscar Mendez, Dacheng Tao, Xuelong Li · 2026

Human perception generalizes well across different domains, but most vision models struggle beyond their training data. This gap motivates multi-dataset learning, where a single model is trained on di…

Computer Science · Preprint

HybridGen: Efficient LLM Generative Inference via CPU-GPU Hybrid Computing

Mao Lin, Xi Wang, Guilherme Cox, Dong Li, Hyeran Jeon · 2026

As modern LLMs support thousands to millions of tokens, KV caches grow to hundreds of gigabytes, stressing memory capacity and bandwidth. Existing solutions, such as KV cache pruning and offloading, a…

Engineering · Preprint

Joint Scheduling of Multi-Band Radar Sensing and DNN Inference for Cross-Stage Parallelism

Yanan Du, Sai Xu, Kezhi Wang, Yansha Deng · 2026

This paper studies end-to-end latency minimization for a multi-band radar sensing and deep neural network (DNN) inference pipeline. Unlike conventional stage-wise designs that treat radar sensing and …

AI & Data Science · Preprint

Test-Time Perturbation Learning with Delayed Feedback for Vision-Language-Action Models

Zehua Zang, Xi Wang, Fuchun Sun, Xiao Xu, Lixiang Lium, Jiahuan Zhou, Jiangmeng Li · 2026

Vision-Language-Action models (VLAs) achieve remarkable performance in sequential decision-making but remain fragile to subtle environmental shifts, such as small changes in object pose. We attribute …

Computer Science · Preprint

MASFuzzer: Fuzz Driver Generation and Adaptive Scheduling via Multidimensional API Sequences

Xingyu Liu, Zengqin Huang, Xiang Gao, Hailong Sun · 2026

Fuzz testing of software libraries relies on fuzz drivers to invoke library APIs. Traditionally, these drivers are written manually by developers - a process that is time-consuming and often inadequat…

Computer Science · Preprint

Towards Energy Efficient Co-Scheduling in HPC

Zhong Zheng, Michael E. Papka, Zhiling Lan · 2026

Modern multi-GPU HPC systems expose substantial computational capacity, yet inefficient GPU allocation often leads to wasted energy and underutilization. In practice, GPU applications exhibit heteroge…

Mathematics · Preprint

Orderings of Generalized k-Markov Numbers

Esther Banaian, Min Huang · 2026

A $k$-Markov number is a positive integer that appears in a positive integral solution to the Diophantine equation $x^2 + y^2 + z^2 + k(xy + xz + yz) = (3+3k)xyz$. This equation was introduced by Gyod…
