256+ open-access research outputs.
We investigate Permutation-Invariant (PI) quantum error-correcting codes encoding a logical qudit of dimension $\mathrm{d}_\mathrm{L}$ in PI states using physical qudits of dimension $\mathrm{d}_\math…
Foundation object detectors such as GLIP and Grounding DINO excel on general-domain data but often degrade in specialized and data-scarce settings like underwater imagery or industrial defects. Typica…
Advertising image generation has increasingly focused on online metrics like Click-Through Rate (CTR), yet existing approaches adopt a "one-size-fits-all" strategy that optimizes for overall CTR whil…
Learning the dependence structure among variables in complex systems is a central problem across medical, natural, and social sciences. These structures can be naturally represented by graphs, and the…
We establish power saving asymptotics for the sum of the divisor function along a binary quartic form, improving on work of Daniel. The proof involves an application of a recent two dimensional delta …
Multi-domain image-to-image translation requires grounding semantic differences expressed in natural language prompts into corresponding visual transformations, while preserving unrelated structural…
The polynomial $\sum_{\pi \in W}q^{maj(\pi)}$ of major index over a classical Weyl group $W$ with a generating set $S$ is called the Mahonian polynomial over $W$, and also the polynomial $\sum_{\pi \i…
This paper introduces a cutting-edge approach to cross-modal interaction for tiny object detection by combining semantic-guided natural language processing with advanced visual recognition backbones. …
Future electron-positron ($e^+e^-$) colliders, operating as Higgs factories or Z factories, promise unprecedented precision electroweak measurements that are vital to testing the Standard Model (SM) and …
We continue the study of Adin, Alon and Roichman [arXiv:2502.14398, 2025] on the number of steps required to sort $n$ labelled points on a circle by transpositions. Imagine that the vertices of a cycl…
Visual speech recognition (VSR), also known as lip reading, is the task of recognizing speech from silent video. Despite significant advancements in VSR over recent decades, most existing methods pay …
As humanoid robots enter real-world environments, ensuring robust locomotion across diverse environments is crucial. This paper presents a computationally efficient hierarchical control framework for …
Medical image grounding aims to align natural language phrases with specific regions in medical images, serving as a foundational task for intelligent diagnosis, visual question answering (VQA), and a…
Digital signatures are fundamental cryptographic tools that provide authentication and integrity in digital communications. However, privacy-sensitive applications, such as e-voting and digital cash, …
Recent advances in 3D neural representations and instance-level editing models have enabled the efficient creation of high-quality 3D content. However, achieving precise local 3D edits remains challen…
In this paper we present Chaoticus, a Python-based package for the GPU-accelerated integration of ODE systems and the computation of chaos indicators, including SALI, GALI, Lagrangian Descriptors base…
Motion capture using sparse inertial sensors has shown great promise due to its portability and lack of occlusion issues compared to camera-based tracking. Existing approaches typically assume that IM…
Vision-Language Navigation (VLN) enables intelligent agents to navigate environments by integrating visual perception and natural language instructions, yet faces significant challenges due to the sca…
Multimodal reference resolution, including phrase grounding, aims to understand the semantic relations between mentions and real-world objects. Phrase grounding between images and their captions is a …
Out-of-distribution (OOD) detection is critical for ensuring the safety and reliability of machine learning systems, particularly in dynamic and open-world environments. In the vision and text domains…