Expertini Research

Browse Research Papers

339+ open-access research outputs.

Showing 339 results for "keval vora" in Engineering
Engineering Preprint PDF DOI

BUT System Description for CHiME-9 MCoRec Challenge

Dominik Klement, Alexander Polok, Nguyen Hai Phong, Prachi Singh, Lukas Burget · 2026

Multi-talker automatic speech recognition (ASR) in conversational recordings remains an open problem, particularly in scenarios with a large portion of overlapping speech where identifying and transcrib…

Engineering Preprint PDF DOI

Dual-LoRA: Parameter-Efficient Adversarial Disentanglement for Cross-Lingual Speaker Verification

Qituan Shangguan, Junhao Du, Kunyang Peng, Feng Xue, Hui Zhang, Xinsheng Wang, Kai Yu, Shuai Wang · 2026

Cross-lingual speaker verification suffers from severe language-speaker entanglement. This causes systematic degradation in the hardest scenario: correctly accepting utterances from the same speaker a…

Engineering Preprint PDF DOI

Similarity Choice and Negative Scaling in Supervised Contrastive Learning for Deepfake Audio Detection

Jaskirat Sudan, Hashim Ali, Surya Subramani, Hafiz Malik · 2026

Supervised contrastive learning (SupCon) is widely used to shape representations, but has seen limited targeted study for audio deepfake detection. Existing work typically combines contrastive terms w…

Engineering Preprint PDF DOI

CRC-SAM: SAM-Based Multi-Modal Segmentation and Quantification of Colorectal Cancer in CT, Colonoscopy, and Histology Images

Daniel Lao · 2026

We present CRC-SAM, a unified framework for colorectal cancer segmentation across colonoscopy, CT, and histopathology images. Unlike prior single-modality methods, CRC-SAM provides consistent, modalit…

Engineering Preprint PDF DOI

Embedding-Based Intrusive Evaluation Metrics for Musical Source Separation Using MERT Representations

Paul A. Bereuter, Alois Sontacchi · 2026

Evaluation of musical source separation (MSS) has traditionally relied on Blind Source Separation Evaluation (BSS-Eval) metrics. However, recent work suggests that BSS-Eval metrics exhibit low correla…

Engineering Preprint PDF DOI

VLA Foundry: A Unified Framework for Training Vision-Language-Action Models

Jean Mercat, Sedrick Keh, Kushal Arora, Isabella Huang, Paarth Shah, Haruki Nishimura, Shun Iwase, Katherine Liu · 2026

We present VLA Foundry, an open-source framework that unifies LLM, VLM, and VLA training in a single codebase. Most open-source VLA efforts specialize on the action training stage, often stitching tog…

Engineering Preprint PDF DOI

Dual-Radio BLE-LoRa Hierarchical Mesh for Infrastructure-Free Emergency Communication

Andrii Vakhnovskyi · 2026

We present a dual-radio hierarchical mesh architecture for infrastructure-free emergency communication that exploits the complementary strengths of Bluetooth Low Energy (BLE) and LoRa. Nodes equipped …

Engineering Preprint PDF DOI

Noncoherent Maximum Likelihood Detection for LoRa Signals in Multipath Fading

The Khai Nguyen, Ebrahim Bedeer, Robert Barton · 2026

This letter derives the noncoherent (NC) maximum likelihood (ML) detection rule for LoRa signals under a Rician multi-path fading channel. The proposed NC-ML detection only requires the channel statisti…

Engineering Preprint PDF DOI

Rapid LoRA Aggregation for Wireless Channel Adaptation in Open-Set Radio Frequency Fingerprinting

Mingxi Zhang, Renjie Xie, Jincheng Wang, Guyue Li, Wei Xu · 2026

Radio frequency fingerprints (RFFs) enable secure wireless authentication but struggle in open-set scenarios with unknown devices and varying channels. Existing methods face challenges in generalizati…

Engineering Preprint PDF DOI

X-VC: Zero-shot Streaming Voice Conversion in Codec Space

Qixi Zheng, Yuxiang Zhao, Tianrui Wang, Wenxi Chen, Kele Xu, Yikang Li, Qinyuan Chen, Xipeng Qiu, Kai Yu, Xie Chen · 2026

Zero-shot voice conversion (VC) aims to convert a source utterance into the voice of an unseen target speaker while preserving its linguistic content. Although recent systems have improved conversion …

Engineering Preprint PDF DOI

Graph-Enhanced LLM for SWAN-ISAC

Qian Gao, Ruikang Zhong, Yuanwei Liu · 2026

Segmented pinching antenna assisted integrated sensing and communication (ISAC) systems enable flexible spatial resource utilization by allowing different waveguide segments to be dynamically configur…

Engineering Preprint PDF DOI

From Prompt to Physical Action: Structured Backdoor Attacks on LLM-Mediated Robotic Control Systems

Mingyang Xie, Jin Wei-Kocsis · 2026

The integration of large language models (LLMs) into robotic control pipelines enables natural language interfaces that translate user prompts into executable commands. However, this digital-to-physic…

Engineering Preprint PDF DOI

Hypernetwork-Conditioned Reinforcement Learning for Robust Control of Fixed-Wing Aircraft under Actuator Failures

Dennis Marquis, Mazen Farhood · 2026

This paper presents a reinforcement learning-based path-following controller for a fixed-wing small uncrewed aircraft system (sUAS) that is robust to certain actuator failures. The controller is condi…

Engineering Preprint PDF DOI

HyperCT: Low-Rank Hypernet for Unified Chest CT Analysis

Fengbei Liu, Sunwoo Kwak, Hao Phung, Nusrat Binta Nizam, Ilan Richter, Nir Uriel, Hadar Averbuch-Elor, Daborah Estrin, Mert R. Sabuncu · 2026

Non-contrast chest CTs offer a rich opportunity for both conventional pulmonary and opportunistic extra-pulmonary screening. While Multi-Task Learning (MTL) can unify these diverse tasks, standard har…

Engineering Preprint PDF DOI

V2X-QA: A Comprehensive Reasoning Dataset and Benchmark for Multimodal Large Language Models in Autonomous Driving Across Ego, Infrastructure, and Cooperative Views

Junwei You, Pei Li, Zhuoyu Jiang, Weizhe Tang, Zilin Huang, Rui Gan, Jiaxi Liu, Yan Zhao, Sikai Chen, Bin Ran · 2026

Multimodal large language models (MLLMs) have shown strong potential for autonomous driving, yet existing benchmarks remain largely ego-centric and therefore cannot systematically assess model perform…

Engineering Preprint PDF DOI

Can Hierarchical Cross-Modal Fusion Predict Human Perception of AI Dubbed Content?

Ashwini Dasare, Nirmesh Shah, Ashishkumar Gudmalwar, Pankaj Wasnik · 2026

Evaluating AI generated dubbed content is inherently multi-dimensional, shaped by synchronization, intelligibility, speaker consistency, emotional alignment, and semantic context. Human Mean Opinion S…

Engineering Preprint PDF DOI

Fine-Tuning Large Language Models for Cooperative Tactical Deconfliction of Small Unmanned Aerial Systems

Iman Sharifi, Alex Zongo, Peng Wei · 2026

The growing deployment of small Unmanned Aerial Systems (sUASs) in low-altitude airspaces has increased the need for reliable tactical deconfliction under safety-critical constraints. Tactical deconfl…

Engineering Preprint PDF DOI

Hybrid Diffusion Model for Breast Ultrasound Image Augmentation

Farhan Fuad Abir, Sanjeda Sara Jennifer, Niloofar Yousefi, Laura J. Brattain · 2026

We propose a hybrid diffusion-based augmentation framework to overcome the critical challenge of ultrasound data augmentation in breast ultrasound (BUS) datasets. Unlike conventional diffusion-based a…

Engineering Preprint PDF DOI

DiT-Flow: Speech Enhancement Robust to Multiple Distortions based on Flow Matching in Latent Space and Diffusion Transformers

Tianyu Cao, Helin Wang, Ari Frummer, Yuval Sieradzki, Adi Arbel, Laureano Moro Velazquez, Jesus Villalba, Oren Gal, Thomas Thebaud, Najim Dehak · 2026

Recent advances in generative models, such as diffusion and flow matching, have shown strong performance in audio tasks. However, speech enhancement (SE) models are typically trained on limited datase…

Engineering Preprint PDF DOI

Developing an ESG-Oriented Large Language Model through ESG Practices

Gabriel Assis, Ayrton Surica, Pedro Kroll, Gabriela Aires, Darian Rabbani, Edson Bollis, Lucas Pellicer, Aline Paes · 2026

Environmental, Social, and Governance (ESG) considerations play a central role in contemporary financial decision-making. In parallel, Large Language Model (LLM) applications in this domain have prima…
