Sports — Research Repository | Expertini Research

AI & Data Science Preprint PDF

From Stochastic to Deterministic: A Multi-Criteria Decision Analysis Framework for Bounded Semantic Parsing in AI-Driven Recruitment Screening

A. H. Syed · 2026

The tension between automation and accuracy sits at the heart of modern talent acquisition. Recruiters need swiftness. Organisations need secure, auditable decisions. And candidates—often talented ind…

AI & Data Science Preprint PDF DOI

LLM as Clinical Graph Structure Refiner: Enhancing Representation Learning in EEG Seizure Diagnosis

Lincan Li, Zheng Chen, Yushun Dong · 2026

Electroencephalogram (EEG) signals are vital for automated seizure detection, but their inherent noise makes robust representation learning challenging. Existing graph construction methods, whether co…

AI & Data Science Preprint PDF DOI

What Makes a Good Terminal-Agent Benchmark Task: A Guideline for Adversarial, Difficult, and Legible Evaluation Design

Ivan Bercovich · 2026

Terminal-agent benchmarks have become a primary signal for measuring the coding and system-administration capabilities of large language models. As the market for evaluation environments grows, so doe…

Computer Science Preprint PDF DOI

To Build or Not to Build? Factors that Lead to Non-Development or Abandonment of AI Systems

Shreya Chappidi, Jatinder Singh · 2026

Responsible AI research typically focuses on examining the use and impacts of deployed AI systems. Yet, there is currently limited visibility into the pre-deployment decisions to pursue building such …

AI & Data Science Preprint PDF DOI

Shuffling-Aware Optimization for Private Vector Mean Estimation

Shun Takagi, Seng Pei Liew · 2026

We study $d$-dimensional unbiased mean estimation in the single-message shuffle model, where each user sends a single privatized message and the analyzer only observes the shuffled multiset of reports…

AI & Data Science Preprint PDF DOI

Models Recall What They Violate: Constraint Adherence in Multi-Turn LLM Ideation

Garvin Kruthof · 2026

When researchers iteratively refine ideas with large language models, do the models preserve fidelity to the original objective? We introduce DriftBench, a benchmark for evaluating constraint adherenc…

Computer Science Preprint PDF DOI

From Mirage to Grounding: Towards Reliable Multimodal Circuit-to-Verilog Code Generation

Guang Yang, Xing Hu, Xiang Chen, Xin Xi · 2026

Multimodal large language models (MLLMs) are increasingly used to translate visual artifacts into code, from UI mockups into HTML to scientific plots into Python scripts. A circuit diagram can be view…

AI & Data Science Preprint PDF DOI

Training-Free Tunnel Defect Inspection and Engineering Interpretation via Visual Recalibration and Entity Reconstruction

Shipeng Liu, Liang Zhao, Dengfeng Chen, Zhanping Song · 2026

Tunnel inspection requires outputs that can support defect localization, measurement, severity grading, and engineering documentation. Existing training-free foundation-model pipelines usually stop at…

AI & Data Science Preprint PDF DOI

Geometry-Calibrated Conformal Abstention for Language Models

Rui Xu, Yi Chen, Sihong Xie, Hui Xiong · 2026

When language models lack relevant knowledge for a given query, they frequently generate plausible responses that can be hallucinations, rather than admitting being agnostic about the answer. Retraini…

AI & Data Science Preprint PDF DOI

Simulating clinical interventions with a generative multimodal model of human physiology

Guy Lutsker, Gal Sapir, Jordi Merino, Smadar Shilo, Anastasia Godneva, Eli Meirom, Shie Mannor, Hagai Rossman, Gal Chechik, Eran Segal · 2026

Understanding how human health changes over time, and why responses to interventions vary between individuals, remains a central challenge in medicine. Here we present HealthFormer, a decoder-only tra…

Computer Science Preprint PDF DOI

An Empirical Evaluation of Code Smell Detection in Angular Applications

Maykon Nunes, Emanuel Coutinho, Carla Bezerra, Ivan Machado · 2026

Angular is one of the most widely adopted frameworks for developing large-scale, dynamic web applications. As projects increase in scope and complexity, developers face growing challenges in managing …

AI & Data Science Preprint PDF DOI

In-Context Prompting Obsoletes Agent Orchestration for Procedural Tasks

Simon Dennis, Michael Diamond, Rivaan Patil, Kevin Shabahang, Hao Guo · 2026

Agent orchestration frameworks -- LangGraph, CrewAI, Google ADK, OpenAI Agents SDK, and others -- place an external orchestrator above the LLM, tracking state and injecting routing instructions at eve…

AI & Data Science Preprint PDF DOI

KellyBench: A Benchmark for Long-Horizon Sequential Decision Making

Thomas Grady, Kip Parker, Iliyan Zarov, Henry Course, Chengxi Taylor, Ross Taylor · 2026

Language models are saturating benchmarks for procedural tasks with narrow objectives. But they are increasingly being deployed in long-horizon, non-stationary environments with open-ended goals. In t…

Computer Science Preprint PDF DOI

RuC: HDL-Agnostic Rule Completion Benchmark Generation

Arnau Ayguade Domingo, Miquel Alberti-Binimelis, Cristian Gutierrez-Gomez, Emanuele Parisi, Razine Moundir Ghorab, Miquel Moreto, Gokcen Kestor, Dario Garcia-Gasulla · 2026

Large Language Models (LLMs) have rapidly improved in performance across code-related tasks, making their integration into Register Transfer Level (RTL) development increasingly attractive. Mimicking …

Computer Science Preprint PDF DOI

LLM-as-a-Judge for Human-AI Co-Creation: A Reliability-Aware Evaluation Framework for Coding

Md Faizul Ibne Amin, Yutaka Watanobe, Daniel M. Muepu, Haruto Suzuki, Kenta Nanaumi, Md Mostafizer Rahman · 2026

LLMs are increasingly employed both as judges for evaluating open-ended outputs and as co-creation partners in AI-assisted programming; yet rigorous evaluation in human-AI co-creation settings remains…

AI & Data Science Preprint PDF DOI

Knowledge Graph Representations for LLM-Based Policy Compliance Reasoning

Wilder Baldwin, Sepideh Ghanavati · 2026

The risks posed by AI features are increasing as they are rapidly integrated into software applications. In response, regulations and standards for safe and secure AI have been proposed. In this paper…

AI & Data Science Preprint PDF DOI

One Single Hub Text Breaks CLIP: Identifying Vulnerabilities in Cross-Modal Encoders via Hubness

Hiroyuki Deguchi, Katsuki Chousa, Yusuke Sakai · 2026

The hubness problem, in which hub embeddings are close to many unrelated examples, occurs often in high-dimensional embedding spaces and may pose a practical threat for purposes such as information re…

Computer Science Preprint PDF DOI

HAVEN: Hybrid Automated Verification ENgine for UVM Testbench Synthesis with LLMs

Chang-Chih Meng, Yu-Ren Lu, Guan-Yu Lin, Tsung Tai Yeh, Kai-Chiang Wu, I-Chen Wu · 2026

Integrated Circuit (IC) verification consumes nearly 70% of the IC development cycle, and recent research leverages Large Language Models (LLMs) to automatically generate testbenches and reduce verifi…

AI & Data Science Preprint PDF DOI

WaferSAGE: Large Language Model-Powered Wafer Defect Analysis via Synthetic Data Generation and Rubric-Guided Reinforcement Learning

Ke Xu · 2026

We present WaferSAGE, a framework for wafer defect visual question answering using small vision-language models. To address data scarcity in semiconductor manufacturing, we propose a three-stage synth…

AI & Data Science Preprint PDF DOI

Math Education Digital Shadows for facilitating learning with LLMs: Math performance, anxiety and confidence in simulated students and AIs

Naomi Esposito, Anthony Tricarico, Luisa Porzio, Ali Aghazadeh Ardebili, Massimo Stella · 2026

To enhance LLMs' impact on math education, we need data on their mathematical prowess and biases across prompts. To fill this gap, we introduce MEDS (Math Education Digital Shadows) as a dataset mappi…

