Informatics in Computer Science — Research Repository

Computer Science Preprint PDF DOI

Unsafe and Unused? A History of Utility Code in Mature Open Source Projects

Brandon Keller, Kaitlin Yandik, Angela Ngo, Andy Meneely · 2026

Filenames are a concise means of conveying information about source code to fellow developers. One such convention is util. Commonly understood to stand for "utility", filenames with the letters util …

Synthetic Biological Intelligence: System-Level Abstractions and Adaptive Bio-Digital Interaction

Martin Schottlender, Pengjie Zhou, Veronika Volkova, Fatima Rani, Ruifeng Zheng, Juan A. Cabrera, Frank H.P. Fitzek, Pit Hofmann · 2026

Concurrent advances across fields such as organoid technology, Microelectrode Arrays (MEAs), neuromorphic computing, and machine learning have given rise to a groundbreaking research paradigm: Synthet…

CoNewsReader: Supporting Comprehensive Understanding and Raising Critical Thoughts on Social Media News Through Comments

Kangyu Yuan, Guanzheng Chen, Sizhe Liang, Hehai Lin, Qingyu Guo, Dingdong Liu, Xiaojuan Ma, Zhenhui Peng · 2026

Critical news reading (CNR), which requires grasping the holistic ideas of and raising critical thoughts on the news, is beneficial yet challenging for general people who usually get information on da…

An Empirical Evaluation of Code Smell Detection in Angular Applications

Maykon Nunes, Emanuel Coutinho, Carla Bezerra, Ivan Machado · 2026

Angular is one of the most widely adopted frameworks for developing large-scale, dynamic web applications. As projects increase in scope and complexity, developers face growing challenges in managing …

SimEval-IR: A Unified Toolkit and Benchmark Suite for Evaluating User Simulators and Search Sessions

Saber Zerhoudi · 2026

User simulators are increasingly central to interactive information retrieval, yet the community lacks standardized evaluation tools. Simulators serve two objectives, behavioral realism (matching real…

NeocorRAG: Less Irrelevant Information, More Explicit Evidence, and More Effective Recall via Evidence Chains

Shiyao Peng, Qianhe Zheng, Zhuodi Hao, Zichen Tang, Rongjin Li, Qing Huang, Jiayu Huang, Jiacheng Liu, Yifan Zhu, Haihong E · 2026

Although precise recall is a core objective in Retrieval-Augmented Generation (RAG), a critical oversight persists in the field: improvements in retrieval performance do not consistently translate to …

How Generative AI Disrupts Search: An Empirical Study of Google Search, Gemini, and AI Overviews

Riley Grossman, Songjiang Liu, Michael K. Chen, Mike Smith, Cristian Borcea, Yi Chen · 2026

Generative AI is being increasingly integrated into web search for the convenience it provides users. In this work, we aim to understand how generative AI disrupts web search by retrieving and present…

Why Self-Supervised Encoders Want to Be Normal

Yuval Domb · 2026

We develop a geometric and information-theoretic framework for encoder-decoder learning built on the Information Bottleneck (IB) principle. Recasting IB as a rate-distortion problem with Kullback-Leib…

Social Media Data Toolkit: Standardization and Anonymization of Social Network Datasets

Ali Najafi, Letizia Iannucci, Mikko Kivela, Onur Varol · 2026

The rapid diversification of social media platforms and the increasing restrictions on official APIs have significantly complicated cross-platform analysis. Researchers are often forced to rely on het…

Purifying Multimodal Retrieval: Fragment-Level Evidence Selection for RAG

Xihang Wang, Zihan Wang, Chengkai Huang, Cao Liu, Ke Zeng, Quan Z. Sheng, Lina Yao · 2026

Multimodal Retrieval-Augmented Generation (MRAG) is widely adopted for Multimodal Large Language Models (MLLMs) with external evidence to reduce hallucinations. Despite its success, most existing MRAG…

Knowledge Affordances for Hybrid Human-AI Information Seeking

Irene Celino · 2026

As information ecosystems grow more heterogeneous, both humans and artificial agents increasingly face a simple yet unresolved question: when seeking knowledge, whom should we ask, and why? Inspired b…

SST-Guard: Detecting and Characterizing Server-Side Google Analytics in the Wild

Muhammad Jazlan, Alexander Gamero-Garrido, Zubair Shafiq, Yash Vekaria · 2026

As web browsers increasingly restrict client-side tracking, the web tracking ecosystem is shifting from client-side to server-side tracking (SST). In SST, the browser sends tracking requests to an int…

CuLifter: Lifting GPU Binaries to Typed IR

Jisheng Zhao, Huanzhi Pu, Shinnung Jeong, Chihyo Ahn, Hyesoon Kim · 2026

GPU compilers merge all data types into a single unified register file, erasing the type information that binary-analysis tools rely on. We show that type recovery from this untyped register file is t…

Gender Bias in YouTube Exposure: Allocative and Structural Inequalities in Political Information Environments

Jipeng Tan, Weifeng Zhang, Ye Wu, Jialin Guo, Yong Min · 2026

Recommendation algorithms have become the dominant mechanism for information distribution on digital platforms, profoundly shaping personalized information consumption environments. However, gender bi…

Secure Cross-Silo Synthetic Genomic Data Generation

Daniil Filienko, Martine De Cock, Sikha Pentyala · 2026

Access to genomic data is highly regulated due to its sensitive nature. While safeguards are essential, cumbersome data access processes pose a significant barrier to the development of AI methods for…

Tracking Conversations: Measuring Content and Identity Exposure on AI Chatbots

Muhammad Jazlan, Ethan Wang, Yash Vekaria, Zubair Shafiq · 2026

AI chatbots are becoming a primary interface for seeking information. As their popularity grows, chatbot providers are starting to deploy advertising and analytics. Despite this, tracking on AI chatbo…

A Reproducibility Study of LLM-Based Query Reformulation

Amin Bigdeli, Radin Hamidi Rad, Hai Son Le, Mert Incesu, Negar Arabzadeh, Charles L. A. Clarke, Ebrahim Bagheri · 2026

Large Language Models (LLMs) are now widely used for query reformulation and expansion in Information Retrieval, with many studies reporting substantial effectiveness gains. However, these results are…

Twitter climate discourse as a signal of pro-environmental behaviors

Edoardo Maggioni, Diego Garlaschelli, Rossana Mastrandrea, Luca Maria Aiello · 2026

Fostering coordinated pro-environmental behaviors at scale is a key challenge for climate mitigation. Individual actions only generate meaningful impact when they diffuse widely and become socially co…

REBENCH: A Procedural, Fair-by-Construction Benchmark for LLMs on Stripped-Binary Types and Names (Extended Version)

Jun Yeon Won, Xin Jin, Shiqing Ma, Zhiqiang Lin · 2026

Large Language Models (LLMs) have achieved remarkable progress in recent years, driving their adoption across a wide range of domains, including computer security. In reverse engineering, LLMs are inc…

NuggetIndex: Governed Atomic Retrieval for Maintainable RAG

Saber Zerhoudi, Michael Granitzer, Jelena Mitrovic · 2026

Retrieval-augmented generation (RAG) systems are frequently evaluated via fact-based metrics, yet standard implementations retrieve passages or static propositions. This unit mismatch between evaluati…
