QUT ePrints

FPGA implementation of dual-microphone delay-and-sum beamforming for in-car speech enhancement and recognition

Ye, Harvey, Whittington, Jim, Himawan, Ivan, Kleinschmidt, Tristan, & Mason, Michael (2009) FPGA implementation of dual-microphone delay-and-sum beamforming for in-car speech enhancement and recognition. In AutoCRC Conference 2009 : Conference Proceedings, Cooperative Research Centre for Advanced Automotive Technology, Melbourne Convention and Exhibition Centre, Melbourne, Victoria.

Abstract

In an automotive environment, the performance of a speech recognition system is affected by environmental noise if the speech signal is acquired directly from a microphone. Speech enhancement techniques are therefore necessary to improve the speech recognition performance. In this paper, a field-programmable gate array (FPGA) implementation of dual-microphone delay-and-sum beamforming (DASB) for speech enhancement is presented. As the first step towards a cost-effective solution, the implementation described in this paper uses a relatively high-end FPGA device to facilitate the verification of various design strategies and parameters. Experimental results show that the proposed design can produce output waveforms close to those generated by a theoretical (floating-point) model with modest usage of FPGA resources. Speech recognition experiments are also conducted on enhanced in-car speech waveforms produced by the FPGA in order to compare recognition performance with the floating-point representation running on a PC.
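As a rough illustration of the technique named in the abstract, the following Python sketch shows dual-microphone delay-and-sum beamforming with a whole-sample steering delay. It is not the authors' FPGA design. The function names, the assumption that the target speech reaches the first microphone earlier, and the 16-bit (Q15) word length used for the fixed-point helper are all hypothetical choices made for this example.

```python
def delay_and_sum(mic_a, mic_b, delay_samples):
    """Average two channels after delaying mic_a by a whole number of samples.

    Assumption for this sketch: the target speech reaches mic_a
    `delay_samples` samples before it reaches mic_b, so delaying mic_a
    time-aligns the speech in both channels before they are averaged.
    """
    if delay_samples < 0:
        raise ValueError("delay_samples must be non-negative")
    n = min(len(mic_a), len(mic_b))
    out = []
    for i in range(n):
        j = i - delay_samples
        a = mic_a[j] if j >= 0 else 0.0  # zero-pad before the first delayed sample
        out.append(0.5 * (a + mic_b[i]))
    return out


def to_q15(x):
    """Round a sample to a 16-bit signed fixed-point (Q15) integer, with clamping.

    The 16-bit word length is an assumption of this sketch, not a value
    taken from the paper.
    """
    return max(-32768, min(32767, round(x * 32768)))


# Hypothetical test signal: the same speech arrives at mic_b two samples late.
speech = [0.0, 0.5, -0.25, 0.125, 0.0, 0.0]
mic_a = speech
mic_b = [0.0, 0.0] + speech[:-2]

enhanced = delay_and_sum(mic_a, mic_b, delay_samples=2)
enhanced_q15 = [to_q15(s) for s in enhanced]
```

After alignment the speech component adds coherently, while noise that is uncorrelated between the two microphones is attenuated by the averaging. Comparing `enhanced` with a quantized version such as `enhanced_q15` is one simple way to see how closely a fixed-point output can track a floating-point model.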

Full-text downloads:

520 since deposited on 16 Mar 2010
85 in the past twelve months

Full-text downloads shows the total number of times this work’s files (e.g., a PDF) have been downloaded from QUT ePrints, as well as the number of downloads in the previous 365 days. The count includes downloads for all files if a work has more than one.

ID Code: 31339
Item Type: Conference Paper
Keywords: field programmable gate arrays, array signal processing, speech enhancement, speech recognition
ISBN: 9780646509952
Subjects: Australian and New Zealand Standard Research Classification > ENGINEERING (090000) > ELECTRICAL AND ELECTRONIC ENGINEERING (090600) > Signal Processing (090609)
Australian and New Zealand Standard Research Classification > ENGINEERING (090000) > ELECTRICAL AND ELECTRONIC ENGINEERING (090600) > Circuits and Systems (090601)
Divisions: Past > QUT Faculties & Divisions > Faculty of Built Environment and Engineering
Past > Institutes > Information Security Institute
Past > Schools > School of Engineering Systems
Copyright Owner: Copyright 2009 [please consult the authors]
Deposited On: 17 Mar 2010 08:16
Last Modified: 01 Mar 2012 00:13