28+ open-access research outputs.
Open-source text-to-speech (TTS) frameworks have emerged as highly adaptable platforms for developing speech synthesis systems across a wide range of languages. However, their applicability is not uni…
Omnia presents a synthetic-data-driven pipeline to accelerate the training, validation, and deployment readiness of militarized humanoids. The approach converts first-person spatial observations captu…
Using observation records of wind speeds from weather stations in the Sultanate of Oman between 2000 and 2023, we compute estimators of the two Weibull distribution parameters (namely, the Weibull dis…
Recent progress in Vision-Language-Action (VLA) models has enabled embodied agents to interpret multimodal instructions and perform complex tasks. However, existing VLAs are mostly confined to short-h…
In this work, we present a new state-of-the-art Romanian Automatic Speech Recognition (ASR) system based on NVIDIA's FastConformer architecture, explored here for the first time in the context of Roma…
In this work, we propose a preliminary design of a horizontal-axis wind turbine (HAWT) as a candidate for the Dhofar Wind Farm project, in the southern Omani governorate of Dhofar, at the southwest par…
Global localization is a fundamental capability required for long-term and drift-free robot navigation. However, current methods fail to relocalize when faced with significantly different viewpoints. …
This paper focuses on adapting the functionalities of the FastPitch model to the Romanian language; extending the set of speakers from one to eighteen; synthesising speech using an anonymous identity;…
Cinematic audio source separation (CASS), the problem of extracting the dialogue, music, and effects stems from their mixture, is a relatively new subtask of audio source separation. To date, only on…
The field of image deblurring has seen tremendous progress with the rise of deep learning models. These models, albeit efficient, are computationally expensive and energy consuming. Dictionary based l…
Solving long sequential tasks poses a significant challenge in embodied artificial intelligence. Enabling a robotic system to perform diverse sequential tasks with a broad range of manipulation skills…
Today, using multiple heterogeneous accelerators efficiently from applications and high-level frameworks, such as TensorFlow and Caffe, poses significant challenges in three respects: (a) sharing acce…
This study aims to analyse nitrogen dioxide (NO2) pollution in the Emilia-Romagna region (Northern Italy) during 2019, with the help of satellite retrievals from the Sentinel-5P mission of the Eur…
One important vision of robotics is to provide physical assistance by manipulating different everyday objects, e.g., hand tools, kitchen utensils. However, many objects designed for dexterous hand-con…
Bipolar disorder is a mental disorder that causes manic and depressive episodes. In this work, we classify recordings from the Bipolar Disorder corpus that contain 7 different tasks, into hypom…
The digital revolution has replaced the use of printed documents with their digital counterparts. However, many applications require the use of both due to several factors, including challenges of dig…
Multi-speaker spoken datasets enable the creation of text-to-speech synthesis (TTS) systems which can output several voice identities. The multi-speaker (MSPK) scenario also enables the use of fewer t…
Deep learning enables the development of efficient end-to-end speech processing applications while bypassing the need for expert linguistic and signal processing features. Yet, recent studies show tha…
In this paper, the generalized regression neural network is used to predict the GNSS position time series. Using the IGS 24-hour final solution data for Bad Hamburg permanent GNSS station in Germany, …
Many semi- and weakly-supervised approaches have been investigated for overcoming the labeling cost of building high quality speech recognition systems. On the challenging task of transcribing social …