Tool for automatically reducing the duration of a speech in English, adapted to a speaker's voice characteristics

Authors: Alarcón Pedroza, Lebis Armando
Gutiérrez Erazo, José Luis

Resource type:

Publication date: 2015

Institution: Universidad de San Buenaventura

Repository: Repositorio USB

Language: spa

Description
Summary:	The tool was developed in three phases: manual labeling of important audio segments, extraction of audio parameters, and system training. For the labeling phase, a web application was implemented to speed up the process. Feature extraction was performed with the MIRTOOLBOX library, and the classifiers and interface were implemented in MATLAB. Five classifiers were compared: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), logistic regression, artificial neural networks (ANNs), and support vector machines (SVMs). The best accuracy was obtained with ANNs (79.19%) and SVMs (81.21%). The reduction percentage was measured on three new audio recordings, giving an average reduction of 27.34% with ANNs and 24.50% with SVMs. Comprehension tests were also performed using a reduced audio recording created by the tool, and an information loss of 16.67% was found. It was concluded that prosodic and spectral parameters provide sufficient data to classify segments by relative importance, and that combining prosodic and spectral parameters in the same data set yields better accuracy.
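
The reduction percentage mentioned in the summary can be read as the share of the original duration that disappears when segments classified as unimportant are dropped. The record does not state the authors' exact formula, so the following Python sketch is only an illustration under that assumption. The function name and the segment durations are hypothetical and are not data from the study.

```python
def reduction_percentage(segments):
    """Return the percentage of total duration that is removed.

    `segments` is a list of (duration_seconds, is_important) pairs.
    Segments flagged as not important are assumed to be dropped
    (an assumption; the record does not describe the exact procedure).
    """
    total = sum(duration for duration, _ in segments)
    if total <= 0:
        raise ValueError("total duration must be positive")
    kept = sum(duration for duration, important in segments if important)
    return 100.0 * (total - kept) / total


# Hypothetical segment durations, for illustration only.
example_segments = [(4.0, True), (2.0, False), (3.0, True), (1.0, False)]
reduction = reduction_percentage(example_segments)
```

In this made-up example, 3 of the 10 seconds are dropped. Averaging such per-recording values over several test recordings would give a figure comparable in kind to the averages reported in the summary.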
