Speaker attribution of multiple telephone conversations using a complete-linkage clustering approach
Ghaemmaghami, Houman, Dean, David, Vogt, Robbie, & Sridharan, Sridha (2012) Speaker attribution of multiple telephone conversations using a complete-linkage clustering approach. In ICASSP 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, Kyoto International Conference Centre, Kyoto, Japan, pp. 4185-4188.
In this paper we propose and evaluate a speaker attribution system using a complete-linkage clustering method. Speaker attribution refers to the annotation of a collection of spoken audio based on speaker identities. This can be achieved using diarization and speaker linking. The main challenge associated with attribution is achieving computational efficiency when dealing with large audio archives. Traditional agglomerative clustering methods with model merging and retraining are not feasible for this purpose. This has motivated the use of linkage clustering methods without retraining. We first propose a diarization system using complete-linkage clustering and show that it outperforms traditional agglomerative and single-linkage clustering-based diarization systems, with relative improvements of 40% and 68%, respectively. We then propose a complete-linkage speaker linking system to achieve attribution and demonstrate a 26% relative improvement in attribution error rate (AER) over the single-linkage speaker linking approach.
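As a rough illustration of the linkage clustering idea mentioned in the abstract, the following minimal Python sketch clusters items from a precomputed pairwise distance matrix, with no model merging or retraining between merges. It is not the authors' system: the function name `linkage_cluster`, the toy distance matrix, and the stopping threshold are all hypothetical, and how the paper actually scores pairs of speaker segments is not described in this record.

```python
def linkage_cluster(dist, threshold, link=max):
    """Agglomerative clustering on a symmetric pairwise distance matrix.

    link=max gives complete linkage (cluster distance = farthest pair);
    link=min gives single linkage (cluster distance = closest pair).
    Merging stops once the closest pair of clusters is farther apart
    than `threshold` (a hypothetical stopping criterion).
    """
    clusters = [[i] for i in range(len(dist))]

    def cluster_distance(a, b):
        # Only the original pairwise distances are used; nothing is retrained.
        return link(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        d, p, q = min(
            (cluster_distance(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Toy distances (assumed values): items 0-1 and 2-3 are close,
# and items 1 and 2 form a weak "bridge" between the two groups.
dist = [
    [0.0, 0.2, 0.9, 1.0],
    [0.2, 0.0, 0.4, 0.9],
    [0.9, 0.4, 0.0, 0.3],
    [1.0, 0.9, 0.3, 0.0],
]

complete = linkage_cluster(dist, threshold=0.5)          # complete linkage
single = linkage_cluster(dist, threshold=0.5, link=min)  # single linkage
```

On this toy input, complete linkage keeps the two groups apart because their farthest members are distant, while single linkage chains everything together through the bridge between items 1 and 2. That chaining effect is one commonly cited reason to prefer complete linkage, although this record does not say whether it explains the reported gains.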
|Item Type:||Conference Paper|
|Keywords:||speaker diarization, speaker linking, speaker attribution, complete-linkage, joint factor analysis|
|Subjects:||Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100); Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Natural Language Processing (080107); Australian and New Zealand Standard Research Classification > ENGINEERING (090000)|
|Divisions:||Current > QUT Faculties and Divisions > Science & Engineering Faculty|
|Copyright Owner:||Copyright 2012 Institute of Electrical and Electronics Engineers, Inc.|
|Deposited On:||19 Feb 2013 02:09|
|Last Modified:||03 Dec 2014 22:05|