Expertini Research

Browse Research Papers

29,518+ open-access research outputs.

Showing 29518 results for "scott anthony sisson"
Computer Science Preprint PDF DOI

Essential, Yet Overlooked: Identity Verification Barriers for Blind and Low Vision People in Government Services

Ryan John Oommen, Tanusree Sharma · 2026

Identity verification is a critical gateway to accessing government services and public benefits, yet contemporary systems are typically designed around visual interaction, leaving blind and low visio…

AI & Data Science Preprint PDF DOI

Beyond Gaussian Bottlenecks: Topologically Aligned Encoding of Vision-Transformer Feature Spaces

Andrew Bond, Ilkin Umut Melanlioglu, Erkut Erdem, Aykut Erdem · 2026

Modern visual world modeling systems increasingly rely on high-capacity architectures and large-scale data to produce plausible motion, yet they often fail to preserve underlying 3D geometry or physic…

Mathematics Preprint PDF DOI

Gauge symmetry and uniqueness in inverse problems for the JMGT equation

Dong Qiu, Xiang Xu, Yeqiong Ye, Ting Zhou · 2026

In this paper, we study an inverse boundary value problem for the Jordan--Moore--Gibson--Thompson equation on a simple Riemannian manifold. We consider an all boundary measurement map that maps Dirich…

AI & Data Science Preprint PDF DOI

TransVLM: A Vision-Language Framework and Benchmark for Detecting Any Shot Transitions

Ce Chen, Yi Ren, Yuanming Li, Viktor Goriachko, Zhenhui Ye, Zujin Guo, Zhibin Hong, Mingming Gong · 2026

Traditional Shot Boundary Detection (SBD) inherently struggles with complex transitions by formulating the task around isolated cut points, frequently yielding corrupted video shots. We address this f…

Computer Science Preprint PDF DOI

From Mirage to Grounding: Towards Reliable Multimodal Circuit-to-Verilog Code Generation

Guang Yang, Xing Hu, Xiang Chen, Xin Xi · 2026

Multimodal large language models (MLLMs) are increasingly used to translate visual artifacts into code, from UI mockups into HTML to scientific plots into Python scripts. A circuit diagram can be view…

AI & Data Science Preprint PDF DOI

The Effects of Visual Priming on Cooperative Behavior in Vision-Language Models

Kenneth J. K. Ong · 2026

As Vision-Language Models (VLMs) become increasingly integrated into decision-making systems, it is essential to understand how visual inputs influence their behavior. This paper investigates the effe…

AI & Data Science Preprint PDF DOI

Dynamic Cluster Data Sampling for Efficient and Long-Tail-Aware Vision-Language Pre-training

Mingliang Liang, Zhuoran Liu, Arjen P. de Vries, Martha Larson · 2026

The computational cost of training a vision-language model (VLM) can be reduced by sampling the training data. Previous work on efficient VLM pre-training has pointed to the importance of semantic dat…

AI & Data Science Preprint PDF DOI

Focus Session: Autonomous Systems Dependability in the era of AI: Design Challenges in Safety, Security, Reliability and Certification

Behnaz Ranjbar, Kirankumar Raveendiran, Sudeep Pasricha, Samarjit Chakraborty, Cecilia Carbonelli, Akash Kumar · 2026

The design of embedded safety-critical systems such as those used in next-generation automotive and autonomous platforms, is increasingly challenged by escalating system complexity, hardware-software …

Physics Preprint PDF DOI

Macroscopic photon counting beating the Poisson noise limit

Timon Schapeler, Fabian Schlue, Isabell Mischke, Michael Stefszky, Benjamin Brecht, Christine Silberhorn, Tim J. Bartley · 2026

Photon counting is a cornerstone of quantum optics. Here, we demonstrate precisely counting from 0 to over 9000 photons, beating the Poisson noise limit by at least $4.1~\mathrm{dB}$ across this range…

AI & Data Science Preprint PDF DOI

Auditing Frontier Vision-Language Models for Trustworthy Medical VQA: Grounding Failures, Format Collapse, and Domain Adaptation

Xupeng Chen, Binbin Shi, Chenqian Le, Qifu Yin, Lang Lin, Haowei Ni, Ran Gong, Panfeng Li · 2026

Deploying vision-language models (VLMs) in clinical settings demands auditable behavior under realistic failure conditions, yet the failure landscape of frontier VLMs on specialized medical inputs is …

AI & Data Science Preprint PDF DOI

Improving Calibration in Test-Time Prompt Tuning for Vision-Language Models via Data-Free Flatness-Aware Prompt Pretraining

Hyeonseo Jang, Jaebyeong Jeon, Joong-Won Hwang, Kibok Lee · 2026

Test-time prompt tuning (TPT) has emerged as a promising technique for enhancing the adaptability of vision-language models by optimizing textual prompts using unlabeled test data. However, prior stud…

AI & Data Science Preprint PDF DOI

SpaAct: Spatially-Activated Transition Learning with Curriculum Adaptation for Vision-Language Navigation

Pengna Li, Kangyi Wu, Shaoqing Xu, Fang Li, Hanbing Li, Lin Zhao, Kailin Lyu, Long Chen, Zhi-Xin Yang, Nanning Zheng · 2026

Vision-and-Language Navigation (VLN) aims to enable an embodied agent to follow natural-language instructions and navigate to a target location in unseen 3D environments. We argue that adapting VLMs t…

Mathematics Preprint PDF DOI

Discontinuous Galerkin IMEX Pressure Correction Scheme for the Poisson-Nernst-Planck-Navier-Stokes Equations

Bikram Bir, Amiya K. Pani · 2026

Based on a discontinuous Galerkin method in the spatial directions and an improved implicit-explicit pressure-correction scheme in the temporal direction, this paper discusses a fully discrete scheme …

AI & Data Science Preprint PDF DOI

EdgeFM: Efficient Edge Inference for Vision-Language Models

Mengling Deng, Yuanpeng Chen, Sheng Yang, Wei Tao, Wenhai Zhang, Hui Song, Linyuanhao Qin, Kai Zhao, Xiaojun Ye, Shanhui Mo, Jingli Fan, Shuang Zhang, Bei Liu, Tiankun Zhao, Xiangjing An · 2026

Vision-language models (VLMs) have demonstrated strong applicability in edge industrial applications, yet their deployment remains severely constrained by requirements for deterministic low latency an…

AI & Data Science Preprint PDF DOI

Understanding Adversarial Transferability in Vision-Language Models for Autonomous Driving: A Cross-Architecture Analysis

David Fernandez, Pedram MohajerAnsari, Amir Salarpour, Mert D. Pese · 2026

Vision-language models (VLMs) are increasingly used in autonomous driving because they combine visual perception with language-based reasoning, supporting more interpretable decision-making, yet their…

AI & Data Science Preprint PDF DOI

Judge, Then Drive: A Critic-Centric Vision Language Action Framework for Autonomous Driving

Lijin Yang, Jianing Huang, Zhongzhan Huang, Shu Liu, Hao Yang · 2026

Recent advances in vision language action (VLA) models have shown remarkable potential for autonomous driving by directly mapping multimodal inputs to control signals. However, previous VLA-based meth…

Physics Preprint PDF DOI

Hindered Prompt-Neutron Evaporation in Surrogate Reactions for $^{239}$Pu(n,f)

D. Ramos, M. Caamano, F. Farget, C. Rodriguez-Tajes, A. Lemasson, M. Rejmund, C. Schmitt, E. Clement, O. Litaize, O. Serot, L. Audouin, J. Benlliure, E. Casarejos, D. Cortina, D. Dore, B. Fernandez-Dominguez, G. de France, A. Heinz, B. Jacquot, C. Paradela, T. Roger · 2026

Isotopic fission-fragment distributions of $^{240}$Pu have been measured, for the first time, as a function of the initial excitation energy, and the prompt neutron multiplicity has been derived from …

AI & Data Science Preprint PDF DOI

Three-Step Nav: A Hierarchical Global-Local Planner for Zero-Shot Vision-and-Language Navigation

Wanrong Zheng, Yunhao Ge, Laurent Itti · 2026

Breakthrough progress in vision-based navigation through unknown environments has been achieved by using multimodal large language models (MLLMs). These models can plan a sequence of motions by evalua…

AI & Data Science Preprint PDF DOI

TAP into the Patch Tokens: Leveraging Vision Foundation Model Features for AI-Generated Image Detection

Ahmed Abdullah, Nikolas Ebert, Oliver Wasenmuller · 2026

Recent methods demonstrate that large-scale pretrained models, such as CLIP vision transformers, effectively detect AI-generated images (AIGIs) from unseen generative models when used as feature extra…

Computer Science Preprint PDF DOI

Distributed Multi-View Vision-Only RSSI Estimation

Jung-Beom Kim, Woongsup Lee · 2026

Received Signal Strength Indicator (RSSI) estimation is essential for wireless link management, yet conventional feedback-based approaches incur uplink overhead, suffer from measurement instability, a…
