Kevin Huck in Computer Science — Research Repository

Computer Science Preprint PDF DOI

Agentic AI in the Software Development Lifecycle: Architecture, Empirical Evidence, and the Reshaping of Software Engineering

Happy Bhati · 2026

The arrival of large language models (LLMs) capable of multi-step reasoning, tool use, and long-horizon planning has produced a qualitative shift in software engineering. Where earlier code-completion…

MAS-SZZ: Multi-Agentic SZZ Algorithm for Vulnerability-Inducing Commit Identification

Sicong Cao, Jinxuan Xu, Le Yu, Jing Yang, Xingwei Lin, Linlin Zhu, Fu Xiao · 2026

Accurate vulnerability-inducing commit identification serves as a foundation for a series of software security tasks, such as vulnerability detection and affected version analysis. A straightforward s…

Automated Classification of Human Code Review Comments with Large Language Models

Semih Caglar, Sukru Eren Gokırmak, Eray Tuzun · 2026

Context: Code reviews are essential for maintaining software quality, yet many human review comments suffer from issues such as redundancy, vagueness, or lack of constructiveness. These types of comme…

Reliability of AI Bots Footprints in GitHub Actions CI/CD Workflows

Syed Muhammad Ashhar Shah, Sehrish Habib, Muizz Hussain, Maryam Abdul Ghafoor, Abdul Ali Bangash · 2026

Continuous Integration and Deployment (CI/CD) workflows are central to modern software delivery, yet the reliability of agentic AI bots operating within these workflows remains underexplored. Using pul…

Terminal Wrench: A Dataset of 331 Reward-Hackable Environments and 3,632 Exploit Trajectories

Ivan Bercovich, Ivgeni Segal, Kexun Zhang, Shashwat Saxena, Aditi Raghunathan, Ziqian Zhong · 2026

We release Terminal Wrench, a subset of 331 terminal-agent benchmark environments, copied from the popular open benchmarks that are demonstrably reward-hackable. The data set includes 3,632 hack traje…

A Complementary Visualisation Suite for Empirical Performance Analysis: Tempographs, Histograms, Ridgeline Plots, Stacked Bar Charts, and Combination Charts Applied to Beethoven's Piano and Cello Sonatas

Ignasi Sole · 2026

The choice of visualisation in empirical performance analysis is not a neutral presentation decision but an analytical one: different graphical forms reveal different features of the same dataset, and…

AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection

Zijie Zhao, Chenyuan Yang, Weidong Wang, Yihan Yang, Ziqi Zhang, Lingming Zhang · 2026

While recent LLM-based agents can identify many candidate bugs in source code, their reports remain static hypotheses that require manual validation, limiting the practicality of automated bug detecti…

LOCARD: An Agentic Framework for Blockchain Forensics

Xiaohang Yu, William Knottenbelt · 2026

Blockchain forensics inherently involves dynamic and iterative investigations, while many existing approaches primarily model it through static inference pipelines. We propose a paradigm shift towards…

Investigating Autonomous Agent Contributions in the Wild: Activity Patterns and Code Change over Time

Razvan Mihai Popescu, David Gros, Andrei Botocan, Rahul Pandita, Prem Devanbu, Maliheh Izadi · 2026

The rise of large language models for code has reshaped software development. Autonomous coding agents, able to create branches, open pull requests, and perform code reviews, now actively contribute t…

Code Review Agent Benchmark

Yuntong Zhang, Zhiyuan Pan, Imam Nur Bani Yusuf, Haifeng Ruan, Ridwan Shariffdeen, Abhik Roychoudhury · 2026

Software engineering agents have shown significant promise in writing code. As AI agents permeate code writing, and generate huge volumes of code automatically -- the matter of code quality comes fron…

TA-Mem: Tool-Augmented Autonomous Memory Retrieval for LLM in Long-Term Conversational QA

Mengwei Yuan, Jianan Liu, Jing Yang, Xianyou Li, Weiran Yan, Yichao Wu, Penghao Liang · 2026

Large Language Model (LLM) has exhibited strong reasoning ability in text-based contexts across various domains, yet the limitation of context window poses challenges for the model on long-range infer…

Can LLMs Hack Enterprise Networks? -- Replicated Computational Results (RCR) Report

Andreas Happe, Jurgen Cito · 2026

This is the Replicated Computational Results (RCR) Report for the paper "Can LLMs Hack Enterprise Networks?" The paper empirically investigates the efficacy and effectiveness of different LLMs for pe…

ATLAS: AI-Assisted Threat-to-Assertion Learning for System-on-Chip Security Verification

Ishraq Tashdid, Kimia Tasnia, Alexander Garcia, Jonathan Valamehr, Sazadur Rahman · 2026

This work presents ATLAS, an LLM-driven framework that bridges standardized threat modeling and property-based formal verification for System-on-Chip (SoC) security. Starting from vulnerability knowle…

DiffBMP: Differentiable Rendering with Bitmap Primitives

Seongmin Hong, Junghun James Kim, Daehyeop Kim, Insoo Chung, Se Young Chun · 2026

We introduce DiffBMP, a scalable and efficient differentiable rendering engine for a collection of bitmap images. Our work addresses a limitation that traditional differentiable renderers are constrai…

Rapid Testing, Duck Lips, and Tilted Cameras: Youth Everyday Algorithm Auditing Practices with Generative AI Filters

Lauren Vogelstein, Vedya Konda, Deborah Fields, Yasmin Kafai, Luis Morales-Navarro, Danae Metaxa · 2026

Today's youth have extensive experience interacting with artificial intelligence and machine learning applications on popular social media platforms, putting youth in a unique position to examine, eva…

CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions

Jingwei Shi, Xinxiang Yin, Jing Huang, Jinman Zhao, Shengyu Tao · 2026

The evaluation of Large Language Models (LLMs) for code generation relies heavily on the quality and robustness of test cases. However, existing benchmarks often lack coverage for subtle corner cases,…

Should I Hide My Duck in the Lake?

Jonas Dann, Gustavo Alonso · 2026

Data lakes spend a significant fraction of query execution time on scanning data from remote storage. Decoding alone accounts for 46% of runtime when running TPC-H directly on Parquet files. To addres…

AIDev: Studying AI Coding Agents on GitHub

Hao Li, Haoxiang Zhang, Ahmed E. Hassan · 2026

AI coding agents are rapidly transforming software engineering by performing tasks such as feature development, debugging, and testing. Despite their growing impact, the research community lacks a com…

Comparing AI Coding Agents: A Task-Stratified Analysis of Pull Request Acceptance

Giovanni Pinna, Jingzhi Gong, David Williams, Federica Sarro · 2026

The rapid adoption of AI-powered coding assistants is transforming software development practices, yet systematic comparisons of their effectiveness across different task types and over time remain li…

Why Agentic-PRs Get Rejected: A Comparative Study of Coding Agents

Sota Nakashima, Yuta Ishimoto, Masanari Kondo, Shane McIntosh, Yasutaka Kamei · 2026

Agentic coding -- software development workflows in which autonomous coding agents plan, implement, and submit code changes with minimal human involvement -- is rapidly gaining traction. Prior work ha…
