Iosif Mporas

imporas@upatras.gr

Journal articles

2008
M Siafarikas, I Mporas, T Ganchev, N Fakotakis (2008) Speech Recognition Using Wavelet Packet Features. Journal of Wavelet Theory and Applications, ISSN 0973-6336
Abstract: In view of the growing use of automatic speech recognition in modern society, we study various alternative representations of the speech signal that have the potential to improve recognition performance. Specifically, the main aims of the present article are to overview some recently proposed, and thus less studied, wavelet packet-based speech parameterization methods and to evaluate their practical importance on the speech recognition task, illustrating their merits compared to other well-known approaches. To this end, working on the widely acknowledged TIMIT (Texas Instruments and Massachusetts Institute of Technology) speech database and relying on the Sphinx-III speech recognizer, we contrast the performance of four wavelet packet-based speech parameterizations against traditional Fourier-based techniques that have been considered for the task of speech recognition for over two decades, including Mel Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Predictive (PLP) cepstral coefficients, which presently dominate the speech recognition field. The experimental results demonstrate that the wavelet packet-based speech features of interest outperform the baseline parameters. This validates wavelet packet-based speech parameterization as a promising research direction that could further reduce the speech recognition error rate.
2007
I Mporas, T Ganchev, M Siafarikas, N Fakotakis (2007) Comparison of Speech Features on the Speech Recognition Task. Journal of Computer Science, ISSN 1549-3636, 3: 8. 608-616
Abstract: In the present work we overview some recently proposed discrete Fourier transform (DFT)- and discrete wavelet packet transform (DWPT)-based speech parameterization methods and evaluate their performance on the speech recognition task. Specifically, in order to assess the practical value of these less studied speech parameterization methods, we evaluate them in a common experimental setup and compare their performance against traditional techniques, such as the Mel-frequency cepstral coefficients (MFCC) and perceptual linear predictive (PLP) cepstral coefficients, which presently dominate the speech recognition field. In particular, utilizing the well-established TIMIT speech corpus and employing the Sphinx-III speech recognizer, we present comparative results for eight different speech parameterization techniques.

Conference papers

2007