Mete Ozay — Research Repository

Computer Science · Preprint

Towards Neuro-symbolic Causal Rule Synthesis, Verification, and Evaluation Grounded in Legal and Safety Principles

Zainab Rehan, Christian Medeiros Adriano, Sona Ghahremani, Holger Giese · 2026

Rule-based systems remain central in safety-critical domains but often struggle with scalability, brittleness, and goal misspecification. These limitations can lead to reward hacking and failures in f…

AI & Data Science · Preprint

TopBench: A Benchmark for Implicit Prediction and Reasoning over Tabular Question Answering

An-Yang Ji, Jun-Peng Jiang, De-Chuan Zhan, Han-Jia Ye · 2026

Large Language Models (LLMs) have advanced Table Question Answering, where most queries can be answered by extracting information or simple aggregation. However, a common class of real-world queries i…

AI & Data Science · Preprint

Can AI Be a Good Peer Reviewer? A Survey of Peer Review Process, Evaluation, and the Future

Sihong Wu, Owen Jiang, Yilun Zhao, Tiansheng Hu, Yiling Ma, Kaiyan Zhang, Manasi Patwardhan, Arman Cohan · 2026

Peer review is a multi-stage process involving reviews, rebuttals, meta-reviews, final decisions, and subsequent manuscript revisions. Recent advances in large language models (LLMs) have motivated me…

AI & Data Science · Preprint

Meta-Analysis Without Normality: Estimating the True Effect Distribution with Penalized Gaussian Mixtures

Daihe Sui, Elizabeth Tipton · 2026

Standard random-effects meta-analysis relies heavily on the assumption that the underlying true effects are normally distributed. In the social sciences, where evidence synthesis increasingly involves…

AI & Data Science · Preprint

Rethinking Agentic Reinforcement Learning In Large Language Models

Fangming Cui, Ruixiao Zhu, Cheng Fang, Sunan Li, Jiahong Li · 2026

Reinforcement Learning (RL) has traditionally focused on training specialized agents to optimize predefined reward functions within narrowly defined environments. However, the advent of powerful Large…

Computer Science · Preprint

Test Before You Deploy: Governing Updates in the LLM Supply Chain

Mohd Sameen Chishti, Damilare Peter Oyinloye, Jingyue Li · 2026

Large Language Models (LLMs) are increasingly used as core dependencies in software systems. However, the hosted LLM services evolve continuously through provider-side updates without explicit version…

Physics · Preprint

On the Difference Between Pulsar Radio Emission Beams from the Two Poles

Xiancong Wu, Hongguang Wang, Hao Tong, Rui Luo, Pengfei Wang, Chengbing Lyu, Hai Lei · 2026

The long-standing assumption of symmetric radio emission beams from the two magnetic poles of pulsars is challenged by observational evidence of asymmetry and underfill. Direct testing of this symmetr…

Mathematics · Preprint

Pancyclicity in Graph Families with the Ore-Type Condition

Luyi Li, Yubo Wang, Guiying Yan · 2026

Let $ n \in \mathbb{N} $ with $ n \geq 3 $, and let $\mathcal{G} = \{G_i:i\in [n]\} $ be a family of $ n $-vertex graphs on a common vertex set $V$, where the graphs in the family do not need to be di…

AI & Data Science · Preprint

Improving Graph Few-shot Learning with Hyperbolic Space and Denoising Diffusion

Yonghao Liu, Jialu Sun, Wei Pang, Fausto Giunchiglia, Ximing Li, Xiaoyue Feng, Renchu Guan · 2026

Graph few-shot learning, which focuses on effectively learning from only a small number of labeled nodes to quickly adapt to new tasks, has garnered significant research attention. Despite recent adva…

AI & Data Science · Preprint

Robust inference methods of diagnostic test accuracy meta-analysis for influential outlying studies via density power divergence

Kotaro Sasaki, Hisashi Noma, Theodoros Evrenoglou · 2026

In diagnostic test accuracy meta-analysis (DTA-MA), standard inference methods using bivariate random-effects models for jointly synthesizing sensitivity and specificity can be sensitive to outlying s…

AI & Data Science · Preprint

Bayesian X-Learner: Calibrated Posterior Inference for Heterogeneous Treatment Effects under Heavy-Tailed Outcomes

Eichi Uehara · 2026

Conditional Average Treatment Effect (CATE) estimation in practice demands three properties simultaneously: heterogeneous effects $\tau(x)$, calibrated uncertainty over them, and robustness to the hea…

AI & Data Science · Preprint

Safe Bilevel Delegation (SBD): A Formal Framework for Runtime Delegation Safety in Multi-Agent Systems

Yuan Sun · 2026

As large language model (LLM) agents are deployed in high-stakes environments, the question of how safely to delegate subtasks to specialized sub-agents becomes critical. Existing work addresses multi…

Computer Science · Preprint

Toward Autonomous SOC Operations: End-to-End LLM Framework for Threat Detection, Query Generation, and Resolution in Security Operations

Md Hasan Saju, Akramul Azim · 2026

Security Operations Centers (SOCs) face mounting operational challenges. These challenges come from increasing threat volumes, heterogeneous SIEM platforms, and time-consuming manual triage workflows.…

AI & Data Science · Preprint

Student Classroom Behavior Recognition Based on Improved YOLOv8s

Xiang Gao, Shuai Hang · 2026

In classroom teaching, student behavior can reflect their learning state and classroom participation, which is of great significance for teaching quality analysis. To address the problems of dense stu…

AI & Data Science · Preprint

Mechanized Foundations of Structural Governance: Machine-Checked Proofs for Governed Intelligence

Alan L. McCann · 2026

We present five results in the theory of structural governance for cognitive workflow systems. Three are mechanized in Coq 8.19 using the Interaction Trees library with parameterized coinduction; two …

AI & Data Science · Preprint

When 2D Tasks Meet 1D Serialization: On Serialization Friction in Structured Tasks

Chung-Hsiang Lo, Lu Li, Diji Yang, Tianyu Zhang, Yunkai Zhang, Yoshua Bengio, Yi Zhang · 2026

Large language models (LLMs) conventionally process structured inputs as 1D token sequences. While natural for prose, such linearization may introduce additional representational burden for tasks whos…

Computer Science · Preprint

Upskilling with Generative AI: Practices and Challenges for Freelance Knowledge Workers

Kashif Imteyaz, Isabel Lopez, Nakul Rajpal, Hunjun Shin, Saiph Savage · 2026

Freelance workers must continually acquire new skills to remain competitive in online labor markets, yet they lack the organizational training, mentorship, and infrastructure available to traditional …

Physics · Preprint

First-Principles Thermodynamic Analysis of Ternary Chalcogenide Phase Change Materials

Felix Adams, Ichiro Takeuchi, Carlos Rios Ocampo, Yifei Mo · 2026

Chalcogenide phase-change materials (PCMs) are important for nonvolatile memory and reconfigurable photonic technologies. The GeTe-Sb2Te3 mixture system, commonly referred to as GST, is the most well-…

Mathematics · Preprint

Applied Random Matrix Theory

Joel A. Tropp · 2026

Random matrices now play a role in many parts of computational mathematics. To advance these applications, it is desirable to have tools that are flexible, easy to use, and powerful. Over the last 25 …

AI & Data Science · Preprint

End-to-end autonomous scientific discovery on a real optical platform

Shuxing Yang, Fujia Chen, Rui Zhao, Junyao Wu, Yize Wang, Haiyao Luo, Ning Han, Qiaolu Chen, Yuze Hu, Wenhao Li, Mingzhu Li, Hongsheng Chen, Yihao Yang · 2026

Scientific research has long been human-led, driving new knowledge and transformative technologies through the continual revision of questions, methods and claims as evidence accumulates. Although lar…
