An examination of audio-visual fused HMMs for speaker recognition
Dean, David B. and Wark, Timothy J. and Sridharan, Sridha (2006) An examination of audio-visual fused HMMs for speaker recognition. In: Second Workshop on Multimodal User Authentication, May 11-12, Toulouse, France.
Abstract
Fused hidden Markov models (FHMMs) have been shown to work well for the task of audio-visual speaker recognition, but only in an output decision-fusion configuration of both the audio- and video-biased versions of the FHMM structure. This paper looks at the performance of the audio- and video-biased versions independently, and shows that the audio-biased version is considerably more capable for speaker recognition. Additionally, this paper shows that by taking advantage of the temporal relationship between the acoustic and visual data, the audio-biased FHMM provides better performance at less processing cost than the best-performing output decision-fusion of regular HMMs.
| ID Code: | 5343 |
|---|---|
| Item Type: | Conference Paper |
| Keywords : | fused hidden Markov model (FHMM), audio, visual speaker verification |
| Subjects: | Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Pattern Recognition and Data Mining (080109); Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Natural Language Processing (080107) |
| Divisions: | QUT Faculties and Divisions > Faculty of Built Environment and Engineering |
| Copyright Owner : | Copyright 2006 (please consult author) |
| Deposited On: | 24 Oct 2006 |
| Last Modified: | 23 Jan 2009 04:43 |