Clustered blind beamforming from ad-hoc microphone arrays

Himawan, Ivan, Mccowan, Iain, & Sridharan, Sridha (2010) Clustered blind beamforming from ad-hoc microphone arrays. IEEE Transactions on Audio, Speech, and Language Processing, 19(4), pp. 661-676.

View at publisher


Microphone arrays have been used in various applications to capture conversations, such as in meetings and teleconferences. In many cases, the microphone and likely source locations are known \emph{a priori}, and calculating beamforming filters is therefore straightforward. In ad-hoc situations, however, when the microphones have not been systematically positioned, this information is not available and beamforming must be achieved blindly. In achieving this, a commonly neglected issue is whether it is optimal to use all of the available microphones, or only an advantageous subset of these. This paper commences by reviewing different approaches to blind beamforming, characterising them by the way they estimate the signal propagation vector and the spatial coherence of noise in the absence of prior knowledge of microphone and speaker locations. Following this, a novel clustered approach to blind beamforming is motivated and developed. Without using any prior geometrical information, microphones are first grouped into localised clusters, which are then ranked according to their relative distance from a speaker. Beamforming is then performed using either the closest microphone cluster, or a weighted combination of clusters. The clustered algorithms are compared to the full set of microphones in experiments on a database recorded on different ad-hoc array geometries. These experiments evaluate the methods in terms of signal enhancement as well as performance on a large vocabulary speech recognition task.

Impact and interest:

33 citations in Scopus
21 citations in Web of Science®
Search Google Scholar™

Citation counts are sourced monthly from Scopus and Web of Science® citation databases.

These databases contain citations from different subsets of available publications and different time periods and thus the citation count from each is usually different. Some works are not in either database and no count is displayed. Scopus includes citations from articles published in 1996 onwards, and Web of Science® generally from 1980 onwards.

Citations counts from the Google Scholar™ indexing service can be viewed at the linked Google Scholar™ search.

Full-text downloads:

391 since deposited on 27 Aug 2010
36 in the past twelve months

Full-text downloads displays the total number of times this work’s files (e.g., a PDF) have been downloaded from QUT ePrints as well as the number of downloads in the previous 365 days. The count includes downloads for all files if a work has more than one.

ID Code: 34235
Item Type: Journal Article
Refereed: Yes
Keywords: Array Signal Processing, Speech Enhancement, Speech Recognition
DOI: 10.1109/TASL.2010.2055560
ISSN: 1558-7916
Subjects: Australian and New Zealand Standard Research Classification > ENGINEERING (090000) > ELECTRICAL AND ELECTRONIC ENGINEERING (090600) > Signal Processing (090609)
Divisions: Past > QUT Faculties & Divisions > Faculty of Built Environment and Engineering
Past > Institutes > Information Security Institute
Past > Schools > School of Engineering Systems
Copyright Owner: Copyright 2010 IEEE
Copyright Statement: Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Deposited On: 27 Aug 2010 01:15
Last Modified: 29 Feb 2012 14:31

Export: EndNote | Dublin Core | BibTeX

Repository Staff Only: item control page