JFA based speaker recognition using delta-phase and MFCC features

Kanagasundaram, Ahilan, Dean, David, & Sridharan, Sridha (2012) JFA based speaker recognition using delta-phase and MFCC features. In Cox, F, Lin, S, Shaw, J, Yuen, I, Miles, K, Demuth, K, et al. (Eds.) Speech Science and Technology 2012: Proceedings of the 14th Australasian International Conference on Speech Science and Technology. The Australasian Speech Science and Technology Association (ASSTA), Australia, pp. 1-4.

Description

This paper investigates the use of mel-frequency delta-phase (MFDP) features in comparison to, and in fusion with, traditional mel-frequency cepstral coefficient (MFCC) features within joint factor analysis (JFA) speaker verification. MFCC features, commonly used in speaker recognition systems, are derived purely from the magnitude spectrum, with the phase spectrum completely discarded. In this paper, we investigate whether features derived from the phase spectrum can provide additional speaker-discriminant information to the traditional MFCC approach in a JFA-based speaker verification system. Results are presented that compare MFCC-only, MFDP-only, and score fusion of the two approaches within a JFA speaker verification system. Based upon the results obtained on the NIST 2008 Speaker Recognition Evaluation (SRE) dataset, we believe that, while MFDP features alone cannot compete with MFCC features, MFDP can provide complementary information that results in improved speaker verification performance when both approaches are combined in score fusion, particularly in the case of shorter utterances.
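The description does not say how the two systems' scores were fused, so the following is only a minimal sketch of one common approach: a weighted sum of per-trial scores after each system's scores are normalized to zero mean and unit variance. The function names, the example scores, and the weight of 0.7 are all hypothetical and are not taken from the paper; in practice the weight would usually be tuned on a held-out development set.

```python
from statistics import mean, pstdev


def z_normalize(scores):
    """Shift and scale a list of scores to zero mean and unit variance."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores identical: nothing to scale, so return zeros.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def fuse_scores(mfcc_scores, mfdp_scores, mfcc_weight=0.7):
    """Weighted sum of normalized per-trial scores from two systems.

    mfcc_weight is a hypothetical fusion weight in [0, 1]; the MFDP
    system receives the remaining weight (1 - mfcc_weight).
    """
    if len(mfcc_scores) != len(mfdp_scores):
        raise ValueError("both systems must score the same trials")
    if not 0.0 <= mfcc_weight <= 1.0:
        raise ValueError("mfcc_weight must lie in [0, 1]")
    a = z_normalize(mfcc_scores)
    b = z_normalize(mfdp_scores)
    return [mfcc_weight * x + (1.0 - mfcc_weight) * y for x, y in zip(a, b)]


# Made-up verification scores for four trials (illustration only).
mfcc_scores = [2.1, -0.4, 1.3, -1.8]
mfdp_scores = [0.9, 0.2, 0.1, -1.0]
fused = fuse_scores(mfcc_scores, mfdp_scores)
```

Normalizing before summing keeps one system from dominating simply because its raw scores span a wider range; each fused score would then be compared against a decision threshold as usual.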

ID Code:

55511

Item Type:

Chapter in Book, Report or Conference volume (Conference contribution)

ORCID iD:

Sridharan, Sridha	orcid.org/0000-0003-4316-9001

Measurements or Duration:

4 pages

Event Title:

Australasian International Conference on Speech Science and Technology

Event Dates:

2012-12-03 - 2012-12-06

Event Location:

Australia

ISSN:

1039-0227

Pure ID:

32297641

Divisions:

Past > QUT Faculties & Divisions > Science & Engineering Faculty

Deposited On:

12 Dec 2012 09:49

Last Modified:

17 Nov 2025 07:18
