QUT ePrints

An examination of audio-visual fused HMMs for speaker recognition

Dean, David B. and Wark, Timothy J. and Sridharan, Sridha (2006) An examination of audio-visual fused HMMs for speaker recognition. In: Second Workshop on Multimodal User Authentication, May 11-12, Toulouse, France.

Abstract

Fused hidden Markov models (FHMMs) have been shown to work well for the task of audio-visual speaker recognition, but only in an output decision-fusion configuration of both the audio- and video-biased versions of the FHMM structure. This paper looks at the performance of the audio- and video-biased versions independently, and shows that the audio-biased version is considerably more capable for speaker recognition. Additionally, this paper shows that, by taking advantage of the temporal relationship between the acoustic and visual data, the audio-biased FHMM provides better performance at less processing cost than the best-performing output decision fusion of regular HMMs.

ID Code: 5343
Item Type: Conference Paper
Keywords: fused hidden Markov model (FHMM), audio, visual speaker verification
Subjects: Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Pattern Recognition and Data Mining (080109)
Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Natural Language Processing (080107)
Divisions: QUT Faculties and Divisions > Faculty of Built Environment and Engineering
Copyright Owner: Copyright 2006 (please consult author)
Deposited On: 24 Oct 2006
Last Modified: 23 Jan 2009 04:43
