I-vector based speaker recognition using advanced channel compensation techniques

Kanagasundaram, Ahilan, Dean, David, Sridharan, Sridha, McLaren, Mitchell L., & Vogt, Robbie (2014) I-vector based speaker recognition using advanced channel compensation techniques. Computer Speech and Language, 28(1), pp. 121-140.

Abstract

This paper investigates advanced channel compensation techniques for improving i-vector speaker verification performance in the presence of high intersession variability, using the NIST 2008 and 2010 SRE corpora. Four channel compensation techniques are investigated:

(a) weighted maximum margin criterion (WMMC),

(b) source-normalized WMMC (SN-WMMC),

(c) weighted linear discriminant analysis (WLDA), and

(d) source-normalized WLDA (SN-WLDA).

We show that, by extracting the discriminatory information between pairs of speakers as well as capturing the source variation information in the development i-vector space, the SN-WLDA based cosine similarity scoring (CSS) i-vector system provides over 20% improvement in EER for NIST 2008 interview and microphone verification and over 10% improvement in EER for NIST 2008 telephone verification, when compared to the SN-LDA based CSS i-vector system. Further, score-level fusion techniques are analyzed to combine the best channel compensation approaches, providing over 8% improvement in DCF over the best single approach (SN-WLDA) for the NIST 2008 interview/telephone enrolment-verification condition. Finally, we demonstrate that the improvements found in the context of CSS also generalize to state-of-the-art GPLDA, with up to 14% relative improvement in EER for NIST SRE 2010 interview and microphone verification and over 7% relative improvement in EER for NIST SRE 2010 telephone verification.
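
As an illustration of the cosine similarity scoring (CSS) step named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a channel compensation projection matrix (here the hypothetical name `A`, filled with random placeholder values rather than a trained SN-WLDA transform) is applied to both i-vectors before the cosine of the angle between them is taken. The function name, dimensions, and decision threshold are all assumptions made for the example.

```python
import numpy as np


def cosine_similarity_score(w_target, w_test, projection=None):
    """Cosine similarity between two i-vectors.

    If a projection matrix of shape (dim, reduced_dim) is given, both
    i-vectors are first mapped into the reduced space via projection.T.
    """
    if projection is not None:
        w_target = projection.T @ w_target
        w_test = projection.T @ w_test
    denom = np.linalg.norm(w_target) * np.linalg.norm(w_test)
    return float(w_target @ w_test / denom)


# Toy data only: random vectors stand in for real i-vectors, and a random
# matrix stands in for a trained channel compensation transform.
rng = np.random.default_rng(0)
dim, reduced_dim = 8, 4
A = rng.standard_normal((dim, reduced_dim))  # hypothetical placeholder projection
w_enrol = rng.standard_normal(dim)           # enrolment (target) i-vector
w_test = rng.standard_normal(dim)            # test i-vector

score = cosine_similarity_score(w_enrol, w_test, A)
threshold = 0.5                              # assumed value, normally tuned on development data
accept = score >= threshold
print(score, accept)
```

Because the score depends only on the angle between the projected vectors, rescaling either i-vector leaves it unchanged; the verification decision is made by comparing the score against a threshold.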

Impact and interest:

6 citations in Scopus
6 citations in Web of Science®

Citation counts are sourced monthly from Scopus and Web of Science® citation databases.

These databases contain citations from different subsets of available publications and different time periods and thus the citation count from each is usually different. Some works are not in either database and no count is displayed. Scopus includes citations from articles published in 1996 onwards, and Web of Science® generally from 1980 onwards.

Full-text downloads:

51 since deposited on 01 May 2013
24 in the past twelve months

Full-text downloads displays the total number of times this work’s files (e.g., a PDF) have been downloaded from QUT ePrints as well as the number of downloads in the previous 365 days. The count includes downloads for all files if a work has more than one.

ID Code: 59522
Item Type: Journal Article
Refereed: Yes
Keywords: Speaker verification, I-vector, GPLDA, LDA, SN-LDA, WLDA, SN-WLDA
DOI: 10.1016/j.csl.2013.04.002
ISSN: 0885-2308
Divisions: Current > Schools > School of Electrical Engineering & Computer Science
Past > Institutes > Information Security Institute
Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright 2013 Elsevier
Deposited On: 01 May 2013 01:05
Last Modified: 04 Jan 2016 02:15
