Marat Ablayev in Computer Science — Research Repository

Computer Science Preprint PDF DOI

Towards Certified Malware Detection: Provable Guarantees Against Evasion Attacks

Nandakrishna Giri, Asmitha K. A., Serena Nicolazzo, Antonino Nocera, Vinod P · 2026

Machine learning-based static malware detectors remain vulnerable to adversarial evasion techniques, such as metamorphic engine mutations. To address this vulnerability, we propose a certifiably robus…

Log-based, Business-aware REST API Testing

Ding Yang, Ruixiang Qian, Zhao Wei, Zhenyu Chen, Chunrong Fang · 2026

REST APIs enable collaboration among microservices. A single fault in a REST API can bring down the entire microservice system and cause significant financial losses, underscoring the importance of RE…

CARAT: Client-Side Adaptive RPC and Cache Co-Tuning for Parallel File Systems

Md Hasanur Rashid, Nathan R. Tallent, Forrest Sheng Bao, Dong Dai · 2026

Tuning parallel file systems in High-Performance Computing (HPC) systems remains challenging due to the complex I/O paths, diverse I/O patterns, and dynamic system conditions. While existing autotuning…

MARA: A Multimodal Adaptive Retrieval-Augmented Framework for Document Question Answering

Hui Wu, Haoquan Zhai, Yuchen Li, Hengyi Cai, Peirong Zhang, Yidan Zhang, Lei Wang, Chunle Wang, Yingyan Hou, Shuaiqiang Wang, Dawei Yin · 2026

Retrieval-based multimodal document QA aims to identify and integrate relevant information from visually rich documents with complex multimodal structures. While retrieval-augmented generation (RAG) h…

Aletheia: What Makes RLVR For Code Verifiers Tick?

Vatsal Venkatkrishna, Indraneil Paul, Iryna Gurevych · 2026

Multi-domain thinking verifiers trained via Reinforcement Learning with Verifiable Rewards (RLVR) are a cornerstone of modern post-training. However, their adoption in code generation has lagged behin…

Measuring the benefits of lying in MARA under egalitarian social welfare

Jonathan Carrero, Ismael Rodriguez, Fernando Rubio · 2026

When some resources are to be distributed among a set of agents following egalitarian social welfare, the goal is to maximize the utility of the agent whose utility turns out to be minimal. In this co…

Question Answering for Multi-Release Systems: A Case Study at Ciena

Parham Khamsepour, Mark Cole, Ish Ashraf, Sandeep Puri, Mehrdad Sabetzadeh, Shiva Nejati · 2026

Companies regularly have to contend with multi-release systems, where several versions of the same software are in operation simultaneously. Question answering over documents from multi-release system…

The Mental World of Large Language Models in Recommendation: A Benchmark on Association, Personalization, and Knowledgeability

Guangneng Hu · 2025

Large language models (LLMs) have shown potential in recommendation systems (RecSys) by using them as either knowledge enhancer or zero-shot ranker. A key challenge lies in the large semantic gap betw…

Explainable Multi-Modal Deep Learning for Automatic Detection of Lung Diseases from Respiratory Audio Signals

S M Asiful Islam Saky, Md Rashidul Islam, Md Saiful Arefin, Shahaba Alam · 2025

Respiratory diseases remain major global health challenges, and traditional auscultation is often limited by subjectivity, environmental noise, and inter-clinician variability. This study presents an …

ExplainableGuard: Interpretable Adversarial Defense for Large Language Models Using Chain-of-Thought Reasoning

Shaowei Guan, Yu Zhai, Zhengyu Zhang, Yanze Wang, Hin Chi Kwok · 2025

Large Language Models (LLMs) are increasingly vulnerable to adversarial attacks that can subtly manipulate their outputs. While various defense mechanisms have been proposed, many operate as black box…

SARSteer: Safeguarding Large Audio Language Models via Safe-Ablated Refusal Steering

Weilin Lin, Jianze Li, Hui Xiong, Li Liu · 2025

Large Audio-Language Models (LALMs) are becoming essential as a powerful multimodal backbone for real-world applications. However, recent studies show that audio inputs can more easily elicit harmful …

On the false election between regulation and innovation. Ideas for regulation through the responsible use of artificial intelligence in research and education.[Spanish version]

Pompeu Casanovas (IIIA-CSIC) · 2025

This short essay is a reworking of the answers offered by the author at the Debate Session of the AIHUB (CSIC) and EduCaixa Summer School, organized by Marta Garcia-Matos and Lissette Lemus, and coord…

The Impact of Critique on LLM-Based Model Generation from Natural Language: The Case of Activity Diagrams

Parham Khamsepour, Mark Cole, Ish Ashraf, DaYuan Tan, Sandeep Puri, Mehrdad Sabetzadeh, Shiva Nejati · 2025

Large Language Models (LLMs) show strong potential for automating model generation from natural-language descriptions. A common approach begins with an initial model generation, followed by an iterati…

WFC/WFD: Web Fuzzing Commons, Dataset and Guidelines to Support Experimentation in REST API Fuzzing

Omur Sahin, Man Zhang, Andrea Arcuri · 2025

Fuzzing REST APIs is an important research problem, with practical applications and impact in industry. As such, a lot of research work has been carried out on this topic in the last few years. Howeve…

Generative Recommendation with Semantic IDs: A Practitioner's Handbook

Clark Mingxuan Ju, Liam Collins, Leonardo Neves, Bhuvesh Kumar, Louis Yufeng Wang, Tong Zhao, Neil Shah · 2025

Generative recommendation (GR) has gained increasing attention for its promising performance compared to traditional models. A key factor contributing to the success of GR is the semantic ID (SID), wh…

Generating Highly Structured Test Inputs Leveraging Constraint-Guided Graph Refinement

Zhaorui Yang, Yuxin Qiu, Haichao Zhu, Qian Zhang · 2025

[Context] Modern AI applications increasingly process highly structured data, such as 3D meshes and point clouds, where test input generation must preserve both structural and semantic validity. Howev…

Conversational LLMs Simplify Secure Clinical Data Access, Understanding, and Analysis

Rafi Al Attrach, Pedro Moreira, Rajna Fani, Renato Umeton, Amelia Fiske, Leo Anthony Celi · 2025

Large-scale clinical databases offer opportunities for medical research, but their complexity creates barriers to effective use. The Medical Information Mart for Intensive Care (MIMIC-IV), one of the …

SheetMind: An End-to-End LLM-Powered Multi-Agent Framework for Spreadsheet Automation

Ruiyan Zhu, Xi Cheng, Ke Liu, Brian Zhu, Daniel Jin, Neeraj Parihar, Zhoutian Xu, Oliver Gao · 2025

We present SheetMind, a modular multi-agent framework powered by large language models (LLMs) for spreadsheet automation via natural language instructions. The system comprises three specialized agent…

Disentangling Locality and Entropy in Ranking Distillation

Andrew Parry, Debasis Ganguly, Sean MacAvaney · 2025

The training process of ranking models involves two key data selection decisions: a sampling strategy, and a labeling strategy. Modern ranking systems, especially those for performing semantic search,…

Policy Testing with MDPFuzz (Replicability Study)

Quentin Mazouni, Helge Spieker, Arnaud Gotlieb, Mathieu Acher · 2025

In recent years, following tremendous achievements in Reinforcement Learning, a great deal of interest has been devoted to ML models for sequential decision-making. Together with these scientific brea…

