QUT ePrints

An examination of audio-visual fused HMMs for speaker recognition

Dean, David B. and Wark, Timothy J. and Sridharan, Sridha (2006) An examination of audio-visual fused HMMs for speaker recognition. In: Second Workshop on Multimodal User Authentication, May 11-12, Toulouse, France.

Abstract

Fused hidden Markov models (FHMMs) have been shown to work well for the task of audio-visual speaker recognition, but only in an output decision-fusion configuration of both the audio- and video-biased versions of the FHMM structure. This paper looks at the performance of the audio- and video-biased versions independently, and shows that the audio-biased version is considerably more capable for speaker recognition. Additionally, this paper shows that by taking advantage of the temporal relationship between the acoustic and visual data, the audio-biased FHMM provides better performance at less processing cost than the best-performing output decision-fusion of regular HMMs.

ID Code: 5343
Item Type: Conference Paper
Keywords: fused hidden Markov model (FHMM), audio, visual speaker verification
Subjects: Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Pattern Recognition and Data Mining (080109)
Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Natural Language Processing (080107)
Divisions: QUT Faculties and Divisions > Faculty of Built Environment and Engineering
Copyright Owner: Copyright 2006 (please consult author)
Deposited On: 24 Oct 2006
Last Modified: 07 Nov 2009 02:25