Mario Velez — Research Repository

Computer Science · Preprint

NuggetIndex: Governed Atomic Retrieval for Maintainable RAG

Saber Zerhoudi, Michael Granitzer, Jelena Mitrovic · 2026

Retrieval-augmented generation (RAG) systems are frequently evaluated via fact-based metrics, yet standard implementations retrieve passages or static propositions. This unit mismatch between evaluati…


Physics · Preprint

On matrix Lax representations for (1+1)-dimensional evolutionary differential-difference equations

Sergei Igonin · 2026

Differential-difference matrix Lax representations (Lax pairs), gauge transformations, and discrete Miura-type transformations (MTs) belong to the main tools in the theory of (nonlinear) integrable di…


AI & Data Science · Preprint

Marco-MoE: Open Multilingual Mixture-of-Expert Language Models with Efficient Upcycling

Fan Jiang, Yu Zhao, Chenyang Lyu, Tianqi Shi, Yichao Du, Feihu Jiang, Longyue Wang, Weihua Luo · 2026

We present Marco-MoE, a suite of fully open multilingual sparse Mixture-of-Experts (MoE) models. Marco-MoE features a highly sparse design in which only around 5% of the total parameters are activate…


Sociology & Anthropology · Preprint

Estimating the cascading global impacts of gas disruptions in Qatar

Ritwick Mishra, Diksha Gupta, Achla Marathe, Krista Danielle Yu, Aaron Schroeder, Samarth Swarup, Brian Klahn, Phil Potter, Madhav Marathe, Anil Vullikanti · 2026

This study examines the global impacts of a localized disruption in Qatar's gas sector using a multi-regional input-output framework and scenario-based analysis. While the direct impacts of this disru…


Computer Science · Preprint

Lost in Decoding? Reproducing and Stress-Testing the Look-Ahead Prior in Generative Retrieval

Kidist Amde Mekonnen, Yongkang Li, Yubao Tang, Simon Lupart, Maarten de Rijke · 2026

Generative retrieval (GR) ranks documents by autoregressively generating document identifiers. Because many GR methods rely on trie-constrained beam search, they are vulnerable to early pruning of rel…


Computer Science · Preprint

A Parametric Memory Head for Continual Generative Retrieval

Kidist Amde Mekonnen, Yubao Tang, Maarten de Rijke · 2026

Generative information retrieval (GenIR) consolidates retrieval into a single neural model that decodes document identifiers (docids) directly from queries. While this model-as-index paradigm offers a…


Computer Science · Preprint

Efficient Rationale-based Retrieval: On-policy Distillation from Generative Rerankers based on JEPA

Teng Chen, Sheng Xu, Feixiang Guo, Xiaoyu Wang, Qingqing Gu, Hongyan Li, Luo Ji · 2026

Unlike traditional fact-based retrieval, rationale-based retrieval typically necessitates cross-encoding of query-document pairs using large language models, incurring substantial computational costs.…


AI & Data Science · Preprint

MARCO: Navigating the Unseen Space of Semantic Correspondence

Claudia Cuttano, Gabriele Trivigno, Carlo Masone, Stefan Roth · 2026

Recent advances in semantic correspondence rely on dual-encoder architectures, combining DINOv2 with diffusion backbones. While accurate, these billion-parameter models generalize poorly beyond traini…


AI & Data Science · Preprint

QuickScope: Certifying Hard Questions in Dynamic LLM Benchmarks

Taylor Lundy, Narun K. Raman, Kevin Leyton-Brown · 2026

LLM benchmarks are increasingly dynamic: instead of containing a fixed set of questions, they define templates and parameters that can generate an effectively unlimited number of question variants. Th…


Engineering · Preprint

Tree Learning: A Multi-Skill Continual Learning Framework for Humanoid Robots

Yifei Yan, Linqi Ye · 2026

As reinforcement learning for humanoid robots evolves from single-task to multi-skill paradigms, efficiently expanding new skills while avoiding catastrophic forgetting has become a key challenge in e…


Computer Science · Preprint

Reproduction Beyond Benchmarks: ConstBERT and ColBERT-v2 Across Backends and Query Distributions

Utshab Kumar Ghosh, Ashish David, Shubham Chatterjee · 2026

Reproducibility must validate architectural robustness, not just numerical accuracy. We evaluate ColBERT-v2 and ConstBERT across five dimensions, finding that while ConstBERT reproduces within 0.05% M…


Physics · Preprint

Strictly correlated electrons in a quantum ring: from Kohn-Sham to Kantorovich potentials

Thiago Carvalho Corso · 2026

Our goal in this paper is twofold. First, we characterize the class of pairwise interactions for which the Seidl conjecture on the structure of optimal plans for the symmetric multimarginal optimal tr…


Mathematics · Preprint

Determinantally Equivalent Functions Beyond the Nowhere-Zero Case

Harry Sapranidis Mantelos · 2026

Let $\Lambda$ be a set and $\mathbb{F}$ a field, and suppose that $K,Q:\Lambda^2\to\mathbb{F}$ are two functions such that for any $n\in\mathbb{N}$ and $x_1,x_2,\ldots,x_n\in\Lambda$, the determinants…


Computer Science · Preprint

FGR-ColBERT: Identifying Fine-Grained Relevance Tokens During Retrieval

Antonin Jarolim, Martin Fajcik · 2026

Document retrieval identifies relevant documents but does not provide fine-grained evidence cues, such as specific relevant spans. A possible solution is to apply an LLM after retrieval; however, this…


AI & Data Science · Preprint

Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design

Bin Zhu, Qianghuai Jia, Tian Lan, Junyang Ren, Feng Gu, Feihu Jiang, Longyue Wang, Zhao Xu, Weihua Luo · 2026

Deep research agents autonomously conduct open-ended investigations, integrating complex information retrieval with multi-step reasoning across diverse sources to solve real-world problems. To sustain…


Computer Science · Preprint

ColBERT-Att: Late-Interaction Meets Attention for Enhanced Retrieval

Raj Nath Patel, Sourav Dutta · 2026

Vector embeddings from pre-trained language models form a core component in Neural Information Retrieval systems across a multitude of knowledge extraction tasks. The paradigm of late interaction, int…


Computer Science · Preprint

PIDP-Attack: Combining Prompt Injection with Database Poisoning Attacks on Retrieval-Augmented Generation Systems

Haozhen Wang, Haoyue Liu, Jionghao Zhu, Zhichao Wang, Yongxin Guo, Xiaoying Tang · 2026

Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of applications. However, their practical deployment is often hindered by issues such as outdated knowledge an…


Mathematics · Preprint

Gromov-Witten invariants and membrane indices of fivefolds via the topological vertex

Yannik Schuler · 2026

We conjecture the existence of almost integer invariants governing the all-genus equivariant Gromov-Witten theory of Calabi-Yau fivefolds with a torus action. We prove the conjecture for skeletal, loc…


Computer Science · Preprint

From Questions to Trust Reports: A LLM-IR Framework for the TREC 2025 DRAGUN Track

Ignacy Alwasiak, Kene Nnolim, Jaclyn Thi, Samy Ateia, Markus Bink, Gregor Donabauer, David Elsweiler, Udo Kruschwitz · 2026

The DRAGUN Track at TREC 2025 targets the growing need for effective support tools that help users evaluate the trustworthiness of online news. We describe the UR_Trecking system submitted for both Ta…


Mathematics · Preprint

Coherent RFRS groups

Sam P. Fisher, Marco Linton, Pablo Sanchez-Peralta · 2026

We prove that a finitely generated virtually RFRS group of cohomological dimension at most $2$ is coherent if and only if its second $L^{2}$-Betti number vanishes if and only if it is virtually free-b…


