QUT ePrints

Robust speaker verification via fusion of speech and lip modalities

Wark, T., Sridharan, S., & Chandran, V. (1999) Robust speaker verification via fusion of speech and lip modalities. In Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, IEEE, Phoenix, Arizona, pp. 3061-3064.

Published Version (PDF 437kB)

    Abstract

    This paper investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. It has been previously shown in our own work, and in the work of others, that features extracted from a speaker's moving lips hold speaker dependencies which are complementary with speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either sub-system alone. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
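    The weighted modality fusion described in the abstract can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's actual method: the weight is found here by a simple grid search that minimises total errors on held-out trials, and all function names, scores, and thresholds are hypothetical.

```python
# Hypothetical sketch of weighted score-level fusion of two modalities
# (speech and lip), with a grid search over the modality weight.
# This is NOT the paper's weighting technique; it is an illustration only.

def fuse_scores(speech_score, lip_score, w):
    """Weighted sum of per-modality verification scores, 0 <= w <= 1."""
    return w * speech_score + (1.0 - w) * lip_score

def best_weight(trials, weights, threshold=0.5):
    """Pick the weight minimising false acceptances + false rejections.

    trials: list of (speech_score, lip_score, is_genuine) tuples
            from a held-out tuning set (illustrative data format).
    """
    def total_errors(w):
        fa = fr = 0
        for speech, lip, genuine in trials:
            accept = fuse_scores(speech, lip, w) >= threshold
            if accept and not genuine:
                fa += 1  # false acceptance: impostor accepted
            elif not accept and genuine:
                fr += 1  # false rejection: genuine speaker rejected
        return fa + fr

    return min(weights, key=total_errors)

# Illustrative use: when speech scores are corrupted by noise,
# the search shifts weight toward the lip modality.
trials = [
    (0.9, 0.8, True),   # genuine, both modalities agree
    (0.8, 0.2, False),  # impostor with noisy (high) speech score
    (0.3, 0.9, True),   # genuine with noisy (low) speech score
    (0.2, 0.1, False),  # impostor, both modalities agree
]
w = best_weight(trials, [0.0, 0.25, 0.5, 0.75, 1.0])
```

    On this toy data the selected weight leans toward the lip scores, mirroring the abstract's claim that lip information reduces error rates when the audio is degraded by background noise.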

    Impact and interest:

    12 citations in Scopus
    Search Google Scholar™
    0 citations in Web of Science®

    Citation counts are sourced monthly from Scopus and Web of Science® citation databases.

    These databases contain citations from different subsets of available publications and different time periods and thus the citation count from each is usually different. Some works are not in either database and no count is displayed. Scopus includes citations from articles published in 1996 onwards, and Web of Science® generally from 1980 onwards.

    Citations counts from the Google Scholar™ indexing service can be viewed at the linked Google Scholar™ search.

    ID Code: 45590
    Item Type: Conference Paper
    Keywords: acoustic noise, audio-visual systems, feature extraction, gesture recognition, sensor fusion, speaker recognition, background noise, error rates, false acceptance, false rejection, features extraction, fusion, moving lips, performance, robust speaker verification, speech features, speech information, weighting
    DOI: 10.1109/ICASSP.1999.757487
    ISBN: 0780350413
    ISSN: 1520-6149
    Divisions: Past > QUT Faculties & Divisions > Faculty of Built Environment and Engineering
    Past > Schools > School of Engineering Systems
    Copyright Owner: Copyright 1999 IEEE
    Copyright Statement: Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
    Deposited On: 17 Oct 2011 11:54
    Last Modified: 17 Oct 2011 11:54

