FPGA implementation of dual-microphone delay-and-sum beamforming for in-car speech enhancement and recognition
Ye, Harvey, Whittington, Jim, Himawan, Ivan, Kleinschmidt, Tristan, & Mason, Michael (2009) FPGA implementation of dual-microphone delay-and-sum beamforming for in-car speech enhancement and recognition. In AutoCRC Conference 2009 : Conference Proceedings, Cooperative Research Centre for Advanced Automotive Technology, Melbourne Convention and Exhibition Centre, Melbourne, Victoria.
In an automotive environment, the performance of a speech recognition system is affected by environmental noise if the speech signal is acquired directly from a microphone. Speech enhancement techniques are therefore necessary to improve the speech recognition performance. In this paper, a field-programmable gate array (FPGA) implementation of dual-microphone delay-and-sum beamforming (DASB) for speech enhancement is presented. As the first step towards a cost-effective solution, the implementation described in this paper uses a relatively high-end FPGA device to facilitate the verification of various design strategies and parameters. Experimental results show that the proposed design can produce output waveforms close to those generated by a theoretical (floating-point) model with modest usage of FPGA resources. Speech recognition experiments are also conducted on enhanced in-car speech waveforms produced by the FPGA in order to compare recognition performance with the floating-point representation running on a PC.
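The comparison the abstract describes, a fixed-point hardware output checked against a floating-point reference, can be illustrated with a minimal sketch of dual-microphone delay-and-sum. This is not the authors' design. The whole-sample steering delay, the 16-bit Q15 word format, the test tone, and the function names `delay_and_sum`, `to_q15` and `delay_and_sum_q15` are all assumptions made for illustration, since the record gives no sample rate, microphone spacing, or word length.

```python
import math

def delay_and_sum(mic1, mic2, delay):
    """Floating-point reference: average mic1 with mic2 delayed by
    `delay` whole samples (delay >= 0, an assumed steering delay)."""
    out = []
    for n in range(len(mic1)):
        delayed = mic2[n - delay] if n - delay >= 0 else 0.0
        out.append(0.5 * (mic1[n] + delayed))
    return out

def to_q15(x):
    """Quantize a value in [-1, 1) to a signed 16-bit Q15 integer
    (assumed word format), saturating at the range limits."""
    return max(-32768, min(32767, int(round(x * 32768))))

def delay_and_sum_q15(mic1, mic2, delay):
    """Same operation on Q15 integers; the halving is an arithmetic
    right shift, as it might be done in hardware."""
    out = []
    for n in range(len(mic1)):
        delayed = mic2[n - delay] if n - delay >= 0 else 0
        out.append((mic1[n] + delayed) >> 1)
    return out

# Hypothetical test signal: a tone that reaches mic1 three samples after mic2.
DELAY = 3
source = [0.5 * math.sin(2 * math.pi * n / 16) for n in range(64)]
mic2 = source
mic1 = [0.0] * DELAY + source[:-DELAY]

reference = delay_and_sum(mic1, mic2, DELAY)
fixed = delay_and_sum_q15([to_q15(v) for v in mic1],
                          [to_q15(v) for v in mic2], DELAY)

# Largest deviation of the fixed-point output from the floating-point one.
max_error = max(abs(f / 32768 - r) for f, r in zip(fixed, reference))
print("max |fixed - float| =", max_error)
```

With the two channels aligned by the steering delay, the target signal adds coherently while uncorrelated noise does not, which is the source of the enhancement. The printed deviation indicates how closely the integer datapath tracks the floating-point model for this one synthetic input.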
|Item Type:||Conference Paper|
|Keywords:||field programmable gate arrays, array signal processing, speech enhancement, speech recognition|
|Subjects:||Australian and New Zealand Standard Research Classification > ENGINEERING (090000) > ELECTRICAL AND ELECTRONIC ENGINEERING (090600) > Signal Processing (090609); Australian and New Zealand Standard Research Classification > ENGINEERING (090000) > ELECTRICAL AND ELECTRONIC ENGINEERING (090600) > Circuits and Systems (090601)|
|Divisions:||Past > QUT Faculties & Divisions > Faculty of Built Environment and Engineering; Past > Institutes > Information Security Institute; Past > Schools > School of Engineering Systems|
|Copyright Owner:||Copyright 2009 [please consult the authors]|
|Deposited On:||17 Mar 2010 08:16|
|Last Modified:||01 Mar 2012 00:13|