Romania) · Engineering · Preprint — Research Repository

Engineering Preprint PDF DOI

How Open is Open TTS? A Practical Evaluation of Open Source TTS Tools

Teodora Ragman, Adrian Bogdan Stanea, Horia Cucu, Adriana Stan · 2026

Open-source text-to-speech (TTS) frameworks have emerged as highly adaptable platforms for developing speech synthesis systems across a wide range of languages. However, their applicability is not uni…

Engineering Preprint PDF DOI

Synthetic Data Pipelines for Adaptive, Mission-Ready Militarized Humanoids

Mohammed Ayman Habib, Aldo Petruzzelli · 2025

Omnia presents a synthetic data driven pipeline to accelerate the training, validation, and deployment readiness of militarized humanoids. The approach converts first-person spatial observations captu…

Engineering Preprint PDF DOI

Wind Speed Weibull Model Identification in Oman, and Computed Normalized Annual Energy Production (NAEP) From Wind Turbines Based on Data From Weather Stations

Osama A. Marzouk · 2025

Using observation records of wind speeds from weather stations in the Sultanate of Oman between 2000 and 2023, we compute estimators of the two Weibull distribution parameters (namely, the Weibull dis…

Engineering Preprint PDF DOI

EchoVLA: Synergistic Declarative Memory for VLA-Driven Mobile Manipulation

Min Lin, Xiwen Liang, Bingqian Lin, Liu Jingzhi, Zijian Jiao, Kehan Li, Yu Sun, Weijia Liufu, Yuhan Ma, Yuecheng Liu, Shen Zhao, Yuzheng Zhuang, Xiaodan Liang · 2025

Recent progress in Vision-Language-Action (VLA) models has enabled embodied agents to interpret multimodal instructions and perform complex tasks. However, existing VLAs are mostly confined to short-h…

Engineering Preprint PDF DOI

Open Source State-Of-the-Art Solution for Romanian Speech Recognition

Gabriel Pirlogeanu, Alexandru-Lucian Georgescu, Horia Cucu · 2025

In this work, we present a new state-of-the-art Romanian Automatic Speech Recognition (ASR) system based on NVIDIA's FastConformer architecture--explored here for the first time in the context of Roma…

Engineering Preprint PDF DOI

Proposed 2MW Wind Turbine for Use in the Governorate of Dhofar at the Sultanate of Oman

Osama Ahmed Marzouk, Omar Rashid Hamdan Al Badi, Maadh Hamed Salman Al Rashdi, Hamed Mohammed Eid Al Balushi · 2025

In this work, we propose a preliminary design of a horizontal-axis wind turbine (HAWT) as a candidate for the Dhofar Wind Farm project, in the southern Omani Governorate "Dhofar", at the southwest par…

Engineering Preprint PDF DOI

ROMAN: Open-Set Object Map Alignment for Robust View-Invariant Global Localization

Mason B. Peterson, Yixuan Jia, Yulun Tian, Annika Thomas, Jonathan P. How · 2024

Global localization is a fundamental capability required for long-term and drift-free robot navigation. However, current methods fail to relocalize when faced with significantly different viewpoints. …

Engineering Preprint PDF DOI

Efficient training strategies for natural sounding speech synthesis and speaker adaptation based on FastPitch

Teodora Ragman, Adriana Stan · 2024

This paper focuses on adapting the functionalities of the FastPitch model to the Romanian language; extending the set of speakers from one to eighteen; synthesising speech using an anonymous identity;…

Engineering Preprint PDF DOI

Remastering Divide and Remaster: A Cinematic Audio Source Separation Dataset with Multilingual Support

Karn N. Watcharasupat, Chih-Wei Wu, Iroro Orife · 2024

Cinematic audio source separation (CASS), as a problem of extracting the dialogue, music, and effects stems from their mixture, is a relatively new subtask of audio source separation. To date, only on…

Engineering Preprint PDF DOI

A Dictionary Based Approach for Removing Out-of-Focus Blur

Uditangshu Aurangabadkar, Anil Kokaram · 2024

The field of image deblurring has seen tremendous progress with the rise of deep learning models. These models, albeit efficient, are computationally expensive and energy consuming. Dictionary based l…

Engineering Preprint PDF DOI

RObotic MAnipulation Network (ROMAN) -- Hybrid Hierarchical Learning for Solving Complex Sequential Tasks

Eleftherios Triantafyllidis, Fernando Acero, Zhaocheng Liu, Zhibin Li · 2023

Solving long sequential tasks poses a significant challenge in embodied artificial intelligence. Enabling a robotic system to perform diverse sequential tasks with a broad range of manipulation skills…

Engineering Preprint PDF DOI

Arax: A Runtime Framework for Decoupling Applications from Heterogeneous Accelerators

Manos Pavlidakis, Stelios Mavridis, Antony Chazapis, Giorgos Vasiliadis, Angelos Bilas · 2023

Today, using multiple heterogeneous accelerators efficiently from applications and high-level frameworks, such as TensorFlow and Caffe, poses significant challenges in three respects: (a) sharing acce…

Engineering Preprint PDF DOI

Estimation of Ground NO2 Measurements from Sentinel-5P Tropospheric Data through Categorical Boosting

Francesco Mauro, Luigi Russo, Fjoralba Janku, Alessandro Sebastianelli, Silvia Liberata Ullo · 2023

This study aims to analyse the Nitrogen Dioxide (NO2) pollution in the Emilia Romagna Region (Northern Italy) during 2019, with the help of satellite retrievals from the Sentinel-5P mission of the Eur…

Engineering Preprint PDF DOI

Roman: Making Everyday Objects Robotically Manipulable with 3D-Printable Add-on Mechanisms

Jiahao Li, Alexis Samoylov, Jeeeun Kim, Xiang 'Anthony' Chen · 2022

One important vision of robotics is to provide physical assistance by manipulating different everyday objects, e.g., hand tools, kitchen utensils. However, many objects designed for dexterous hand-con…

Engineering Preprint PDF DOI

Speech Analysis for Automatic Mania Assessment in Bipolar Disorder

Pınar Baki, Heysem Kaya, Elvan Ciftci, Huseyin Gulec, Albert Ali Salah · 2022

Bipolar disorder is a mental disorder that causes periods of manic and depressive episodes. In this work, we classify recordings from Bipolar Disorder corpus that contain 7 different tasks, into hypom…

Engineering Preprint PDF DOI

Source Printer Identification using Printer Specific Pooling of Letter Descriptors

Sharad Joshi, Yogesh Kumar Gupta, Nitin Khanna · 2021

The digital revolution has replaced the use of printed documents with their digital counterparts. However, many applications require the use of both due to several factors, including challenges of dig…

Engineering Preprint PDF DOI

An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis

Beata Lorincz, Adriana Stan, Mircea Giurgiu · 2021

Multi-speaker spoken datasets enable the creation of text-to-speech synthesis (TTS) systems which can output several voice identities. The multi-speaker (MSPK) scenario also enables the use of fewer t…

Engineering Preprint PDF DOI

RECOApy: Data recording, pre-processing and phonetic transcription for end-to-end speech-based applications

Adriana Stan · 2020

Deep learning enables the development of efficient end-to-end speech processing applications while bypassing the need for expert linguistic and signal processing features. Yet, recent studies show tha…

Engineering Preprint PDF DOI

On the suitability of generalized regression neural networks for GNSS position time series prediction for geodetic applications in geodesy and geophysics

M. Kiani · 2020

In this paper, the generalized regression neural network is used to predict the GNSS position time series. Using the IGS 24-hour final solution data for Bad Hamburg permanent GNSS station in Germany, …

Engineering Preprint PDF DOI

Large scale weakly and semi-supervised learning for low-resource video ASR

Kritika Singh, Vimal Manohar, Alex Xiao, Sergey Edunov, Ross Girshick, Vitaliy Liptchinsky, Christian Fuegen, Yatharth Saraf, Geoffrey Zweig, Abdelrahman Mohamed · 2020

Many semi- and weakly-supervised approaches have been investigated for overcoming the labeling cost of building high quality speech recognition systems. On the challenging task of transcribing social …
