4.7 Article

An FPGA-Based Embedded Robust Speech Recognition System Designed by Combining Empirical Mode Decomposition and a Genetic Algorithm

Journal

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT
Volume 61, Issue 9, Pages 2560-2572

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIM.2012.2190344

Keywords

Embedded system; empirical mode decomposition (EMD); field-programmable gate array (FPGA); genetic algorithm (GA); robust speech recognition; System-On-a-Chip (SOC) architecture

Funding

  1. National Science Council of the Republic of China [NSC 100-2221-E-390-025-MY2]

Abstract

This paper presents a field-programmable gate array (FPGA)-based robust speech measurement and recognition system, with environmental noise as its main concern. To speed up recognition on the FPGA, the discrete hidden Markov model is used to lessen the computational burden inherent in speech recognition. Empirical mode decomposition is used to decompose the measured, noise-contaminated speech signal into several intrinsic mode functions (IMFs), which are then weighted and summed to reconstruct the original clean speech signal. Unlike previous research, in which IMFs were selected by trial and error for specific applications, the weight for each IMF is designed by the genetic algorithm to obtain an optimal solution. The experimental results show that this method achieves a better speech recognition rate for speech subject to various environmental noises. The paper also explores the hardware realization of the designed speech measurement and recognition system on an FPGA-based embedded system with the System-On-a-Chip (SOC) architecture. Because the central-processing-unit core adopted in the SOC has limited computational ability, the integer fast Fourier transform (FFT) replaces the floating-point FFT to speed up the extraction of speech features as mel-frequency cepstral coefficients. The result is a significant reduction in calculation time without affecting the speech recognition rate. The experiments indicate that the implemented hardware performs significantly better than that of existing research.
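
The weighted-IMF reconstruction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes the IMFs are already available (three synthetic components stand in for real EMD output), and it uses mean squared error against a known clean signal as the fitness, whereas the paper's actual fitness criterion is not stated here. The names `reconstruct`, `mse`, and `evolve_weights`, the weight range [0, 1], and all GA settings are hypothetical choices made for illustration.

```python
import math
import random

def reconstruct(imfs, weights):
    """Weighted sum of components: y[n] = sum_k w_k * imf_k[n]."""
    return [sum(w * imf[n] for w, imf in zip(weights, imfs))
            for n in range(len(imfs[0]))]

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def evolve_weights(imfs, target, pop_size=30, generations=60, seed=0):
    """Tiny genetic algorithm over per-component weights in [0, 1].

    Assumed fitness: MSE between the weighted reconstruction and a
    clean reference signal (an illustrative stand-in, not the paper's
    criterion). Requires at least two components.
    """
    rng = random.Random(seed)
    k = len(imfs)

    def fitness(w):
        return mse(reconstruct(imfs, w), target)

    pop = [[rng.random() for _ in range(k)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        elite = pop[: pop_size // 2]          # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            cut = rng.randrange(1, k)         # one-point crossover
            child = p1[:cut] + p2[cut:]
            i = rng.randrange(k)              # mutate one weight, clamp to [0, 1]
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = elite + children
    return min(pop, key=fitness)

# Synthetic stand-ins for IMFs: broadband noise, a clean tone, slow drift.
N = 200
tone = [math.sin(2 * math.pi * 5 * n / N) for n in range(N)]
noise_rng = random.Random(1)
hiss = [noise_rng.gauss(0, 0.5) for _ in range(N)]
drift = [0.3 * n / N for n in range(N)]
imfs = [hiss, tone, drift]

best = evolve_weights(imfs, tone)
err_weighted = mse(reconstruct(imfs, best), tone)
err_plain = mse(reconstruct(imfs, [1.0, 1.0, 1.0]), tone)
print("weights:", [round(w, 2) for w in best])
print("weighted error below unweighted sum:", err_weighted < err_plain)
```

In this toy setting the search tends to push the weight of the noise-like component down and keep the tone-like component, which is the intuition behind weighting IMFs rather than selecting them by trial and error.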

