Deep spatio-temporal features for multimodal emotion recognition

, , , , , & (2017) Deep spatio-temporal features for multimodal emotion recognition. In Turk, M, Brown, M S, Feris, R, & Sanderson, C (Eds.) Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision (WACV). Institute of Electrical and Electronics Engineers Inc., United States of America, pp. 1215-1223.


Description

Automatic emotion recognition has attracted great interest, and numerous solutions have been proposed, most of which focus on either facial expression or acoustic information alone. While more recent research has considered multimodal approaches, individual modalities are often combined only by simple fusion at the feature and/or decision level. In this paper, we introduce a novel approach using 3-dimensional convolutional neural networks (C3Ds) to model spatio-temporal information, cascaded with multimodal deep-belief networks (DBNs) that represent the audio and video streams. Experiments conducted on the eNTERFACE multimodal emotion database demonstrate that this approach improves multimodal emotion recognition performance and significantly outperforms the recent state-of-the-art.
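The paper's cascaded C3D + DBN pipeline is not reproduced here, but the core operation it builds on — a convolution that slides over time as well as space — can be sketched in plain NumPy. All shapes, filter sizes, and the simple concatenation fusion below are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def conv3d(volume, kernel):
    """Naive valid-mode 3D convolution over a (time, height, width) volume.

    Each output element is the sum of an elementwise product between the
    kernel and the matching spatio-temporal patch of the input.
    """
    T, H, W = volume.shape
    t, h, w = kernel.shape
    out = np.zeros((T - t + 1, H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                out[i, j, k] = np.sum(volume[i:i + t, j:j + h, k:k + w] * kernel)
    return out

# Toy video clip: 16 grayscale frames of 32x32 pixels (illustrative sizes).
video = np.random.rand(16, 32, 32)
kernel = np.random.rand(3, 3, 3)  # a single 3x3x3 spatio-temporal filter

features = conv3d(video, kernel)  # shape (14, 30, 30): time and space both shrink

# Hypothetical feature-level fusion: max-pool the video features over time,
# then concatenate with a stand-in audio feature vector. The paper instead
# learns a joint representation with multimodal DBNs.
video_vec = features.reshape(features.shape[0], -1).max(axis=0)  # 900-dim
audio_vec = np.random.rand(128)                                  # stand-in
fused = np.concatenate([video_vec, audio_vec])                   # 1028-dim
```

The key difference from a 2D CNN is visible in the output shape: the temporal axis is convolved too, so motion across adjacent frames contributes to every feature.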

Impact and interest:

65 citations in Scopus
43 citations in Web of Science®

Citation counts are sourced monthly from Scopus and Web of Science® citation databases.



Full-text downloads:

418 since deposited on 26 Apr 2017
60 in the past twelve months


ID Code: 105854
Item Type: Chapter in Book, Report or Conference volume (Conference contribution)
ORCID iD:
Nguyen Thanh, Kien: orcid.org/0000-0002-3466-9218
Sridharan, Sridha: orcid.org/0000-0003-4316-9001
Fookes, Clinton: orcid.org/0000-0002-8515-6324
Measurements or Duration: 9 pages
Keywords: emotion recognition, multimodal emotion recognition
DOI: 10.1109/WACV.2017.140
ISBN: 978-1-5090-4822-9
Pure ID: 33163837
Divisions: Past > Institutes > Institute for Future Environments
Past > QUT Faculties & Divisions > Science & Engineering Faculty
Copyright Owner: Consult author(s) regarding copyright matters
Copyright Statement: This work is covered by copyright. Unless the document is being made available under a Creative Commons Licence, you must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the document is available under a Creative Commons License (or other specified license) then refer to the Licence for details of permitted re-use. It is a condition of access that users recognise and abide by the legal requirements associated with these rights. If you believe that this work infringes copyright please provide details by email to qut.copyright@qut.edu.au
Deposited On: 26 Apr 2017 03:46
Last Modified: 18 Jul 2024 20:13