Junming Zhao — Research Repository

AI & Data Science · Preprint

Exploration Hacking: Can LLMs Learn to Resist RL Training?

Eyon Jang, Damon Falck, Joschka Braun, Nathalie Kirch, Achu Menon, Perusha Moodley, Scott Emmons, Roland S. Zimmermann, David Lindner · 2026

Reinforcement learning (RL) has become essential to the post-training of large language models (LLMs) for reasoning, agentic capabilities and alignment. Successful RL relies on sufficient exploration …

AI & Data Science · Preprint

PhyCo: Learning Controllable Physical Priors for Generative Motion

Sriram Narayanan, Ziyu Jiang, Srinivasa Narasimhan, Manmohan Chandraker · 2026

Modern video diffusion models excel at appearance synthesis but still struggle with physical consistency: objects drift, collisions lack realistic rebound, and material responses seldom match their un…

Engineering · Preprint

FlexiTac: A Low-Cost, Open-Source, Scalable Tactile Sensing Solution for Robotic Systems

Binghao Huang, Yunzhu Li · 2026

We present FlexiTac, a low-cost, open-source, and scalable piezoresistive tactile sensing solution designed for robotic end-effectors. FlexiTac is a practical "plug-in" module consisting of (i) thin, …

Physics · Preprint

Optimal current-based sensing of phonon temperature using a finite reservoir

Sindre Brattegard, Stephanie Matern, Mark T. Mitchison, Saulo V. Moreira · 2026

In realistic nanoscale transport set-ups, electron-phonon coupling leads to the exchange of heat between phonon baths and electronic reservoirs with finite heat capacities. Such exchange affects the f…

Computer Science · Preprint

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

Chenxin Li, Zhengyang Tang, Huangxin Lin, Yunlong Lin, Shijue Huang, Shengyuan Liu, Bowen Ye, Rang Li, Lei Li, Benyou Wang, Yixuan Yuan · 2026

LLM agents are expected to complete end-to-end units of work across software tools, business services, and local workspaces. Yet many agent benchmarks freeze a curated task set at release time and gra…

AI & Data Science · Preprint

PRISM: Pre-alignment via Black-box On-policy Distillation for Multimodal Reinforcement Learning

Sudong Wang, Weiquan Huang, Xiaomin Yu, Zuhao Yang, Hehai Lin, Keming Wu, Chaojun Xiao, Chen Chen, Wenxuan Wang, Beier Zhu, Yunjian Zhang, Chengwei Qin · 2026

The standard post-training recipe for large multimodal models (LMMs) applies supervised fine-tuning (SFT) on curated demonstrations followed by reinforcement learning with verifiable rewards (RLVR). H…

Engineering · Preprint

Intelligent Self-tuning Active EMI Filtering for Electrified Automotive Power Systems Using Reinforcement Learning

Mahuizi Lu, Kelin Jia, Rajib Goswami, Yukun Hu · 2026

The rapid electrification and intelligence of modern transportation systems place stringent demands on the electromagnetic compatibility, reliability, and adaptability of automotive power electronics.…

AI & Data Science · Preprint

Characterizing the Consistency of the Emergent Misalignment Persona

Anietta Weckauff, Yuchen Zhang, Maksym Andriushchenko · 2026

Fine-tuning large language models (LLMs) on narrowly misaligned data generalizes to broadly misaligned behavior, a phenomenon termed emergent misalignment (EM). While prior work has found a correlatio…

Physics · Preprint

A nanoionic diode: Equilibrium rectifying junction enabling large and stable resistance variations

Chuanlian Xiao, Joachim Maier · 2026

We report on a new type of rectifier which is in full contact equilibrium and thus, if down-sized to the nanoscale, shows no drift even if exposed to elevated temperatures and/or extreme waiting times…

AI & Data Science · Preprint

Dynamic Scaled Gradient Descent for Stable Fine-Tuning for Classifications

Nghia Bui, Lijing Wang · 2026

Fine-tuning pretrained models has become a standard approach to adapting pretrained knowledge to improve accuracy on new sparse, imbalanced datasets. However, issues arise when optimization falls i…

AI & Data Science · Preprint

ITS-Mina: A Harris Hawks Optimization-Based All-MLP Framework with Iterative Refinement and External Attention for Multivariate Time Series Forecasting

Pourya Zamanvaziri, Amirhossein Sadr, Aida Pakniyat, Dara Rahmati · 2026

Multivariate time series forecasting plays a pivotal role in numerous real-world applications, including financial analysis, energy management, and traffic planning. While Transformer-based architectu…

AI & Data Science · Preprint

Language Models Refine Mechanical Linkage Designs Through Symbolic Reflection and Modular Optimisation

Joao Pedro Gandarela, Thiago Rios, Stefan Menzel, Andre Freitas · 2026

Designing mechanical linkages involves combinatorial topology selection and continuous parameter fitting. We show that language models can systematically improve linkage designs through symbolic repre…

AI & Data Science · Preprint

GUI Agents with Reinforcement Learning: Toward Digital Inhabitants

Junan Hu, Jian Liu, Jingxiang Lai, Jiarui Hu, Yiwei Sheng, Shuang Chen, Jian Li, Dazhao Du, Song Guo · 2026

Graphical User Interface (GUI) agents have emerged as a promising paradigm for intelligent systems that perceive and interact with graphical interfaces visually. Yet supervised fine-tuning alone canno…

AI & Data Science · Preprint

Can AI Be a Good Peer Reviewer? A Survey of Peer Review Process, Evaluation, and the Future

Sihong Wu, Owen Jiang, Yilun Zhao, Tiansheng Hu, Yiling Ma, Kaiyan Zhang, Manasi Patwardhan, Arman Cohan · 2026

Peer review is a multi-stage process involving reviews, rebuttals, meta-reviews, final decisions, and subsequent manuscript revisions. Recent advances in large language models (LLMs) have motivated me…

AI & Data Science · Preprint

Generate Your Talking Avatar from Video Reference

Zujin Guo, Zhenhui Ye, Yi Ren, Yuanming Li, Ce Chen, Zhibin Hong, Chen Change Loy · 2026

Existing talking avatar methods typically adopt an image-to-video pipeline conditioned on a static reference image within the same scene as the target generation. This restricted, single-view perspect…

Computer Science · Preprint

NetSatBench: A Distributed LEO Constellation Emulator with an SRv6 Case Study

Andrea Detti, Shahram Dadras, Giuseppe Tropea · 2026

NetSatBench is a distributed emulation platform for evaluating communication protocols and application workloads over large-scale LEO satellite systems. Satellites, gateways, and user terminals are im…

AI & Data Science · Preprint

CastFlow: Learning Role-Specialized Agentic Workflows for Time Series Forecasting

Bokai Pan, Mingyue Cheng, Zhiding Liu, Shuo Yu, Xiaoyu Tao, Yuchong Wu, Qi Liu, Defu Lian, Enhong Chen · 2026

Recently, large language models (LLMs) have shown great promise in time series forecasting. However, most existing LLM-based forecasting methods still follow a static generative paradigm that directly…

AI & Data Science · Preprint

Taming Noise-Induced Prototype Degradation for Privacy-Preserving Personalized Federated Fine-Tuning

Yuhua Wang, Qinnan Zhang, Xiaodong Li, Huan Zhang, Yifan Sun, Wangjie Qiu, Hainan Zhang, Yongxin Tong, Zhiming Zheng · 2026

Prototype-based Personalized Federated Learning (ProtoPFL) enables efficient multi-domain adaptation by communicating compact class prototypes, but directly sharing them poses privacy risks. A common …

Computer Science · Preprint

WOOTdroid: Whole-system Online On-device Tracing for Android

Simon Althaus, Nikolaos Alexopoulos, Max Muhlhauser, Christian Reuter, Ephraim Zimmer · 2026

System auditing on Android faces two problems. First, existing syscall tracers lose events under load, silently overwriting entries faster than a user space reader can drain them. Second, security-rel…

Engineering · Preprint

Learning-Based Hierarchical Scene Graph Matching for Robot Localization Leveraging Prior Maps

Nimrod Millenium Ndulue, Jose Andres Millan-Romera, Matteo Giorgi, Holger Voos, Jose Luis Sanchez-Lopez · 2026

Accurate localization is a fundamental requirement for autonomous robots operating in indoor environments. Scene graphs encode the spatial structure of an environment as a hierarchy of semantic entiti…
