Multiple-instance multiple-label learning for the classification of frog calls with acoustic event detection

Xie, Jie, Towsey, Michael, Zhang, Liang, Yasumiba, Kiyomi, Schwarzkopf, Lin, Zhang, Jinglan, & Roe, Paul (2016) Multiple-instance multiple-label learning for the classification of frog calls with acoustic event detection. In Mansouri, Alamin, Nouboud, Fathallah, Chalifour, Alain, Mammass, Driss, Meunier, Jean, & Elmoataz, Abderrahim (Eds.) Image and Signal Processing: 7th International Conference, ICISP 2016, Trois-Rivières, QC, Canada, May 30 - June 1, 2016, Proceedings, Springer International Publishing, Québec, Canada, pp. 222-230.


Frog call classification has received increasing attention due to its importance for the ecosystem. Traditionally, frog calls have been classified with single-instance single-label classifiers. However, since different frog species tend to call simultaneously, classifying frog calls becomes a multiple-instance multiple-label learning problem. In this paper, we propose a novel method for the classification of frog species using multiple-instance multiple-label (MIML) classifiers. Specifically, continuous recordings are first segmented into 10-second audio clips. For each audio clip, acoustic event detection is used to segment frog syllables. Then, three feature sets are extracted from each syllable: mask descriptor, profile statistics, and the combination of mask descriptor and profile statistics. Next, a bag generator is applied to the extracted features. Finally, three MIML classifiers, MIML-SVM, MIML-RBF, and MIML-kNN, are employed to tag each audio clip with the frog species it contains. Experimental results show that our proposed method can achieve high accuracy (81.8% true positive/negatives) for frog call classification.
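The data layout implied by the abstract can be illustrated with a minimal Python sketch: a recording is cut into 10-second clips, and each clip becomes a "bag" of per-syllable feature vectors carrying a set of species labels. This is not the authors' implementation. The names `Bag`, `clip_bounds`, and `make_bag`, the 16 kHz sample rate, the feature values, and the species names are all hypothetical, and dropping a trailing partial clip is an assumption. Acoustic event detection, feature extraction, and the MIML classifiers themselves are not shown.

```python
from dataclasses import dataclass, field

CLIP_SECONDS = 10  # clip length stated in the abstract


@dataclass
class Bag:
    """One audio clip: a bag of syllable feature vectors plus species labels."""
    instances: list                           # one feature vector per syllable
    labels: set = field(default_factory=set)  # species present in the clip


def clip_bounds(n_samples, sample_rate, clip_seconds=CLIP_SECONDS):
    """Return (start, end) sample indices of consecutive full-length clips.

    Assumption: a trailing partial clip is dropped.
    """
    step = clip_seconds * sample_rate
    return [(s, s + step) for s in range(0, n_samples - step + 1, step)]


def make_bag(syllable_features, species):
    """Group the feature vectors of one clip's syllables into a labelled bag."""
    return Bag(instances=[list(f) for f in syllable_features],
               labels=set(species))


# Hypothetical 35-second recording sampled at 16 kHz (assumed rate).
bounds = clip_bounds(35 * 16000, 16000)

# Hypothetical clip with two detected syllables and two calling species.
bag = make_bag([[0.1, 0.2], [0.3, 0.4]], ["species_a", "species_b"])
```

Each clip is one training example with several instances and several labels at once, which is what separates the MIML setting from single-instance single-label classification.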

Impact and interest:

0 citations in Scopus

Full-text downloads:

2 since deposited on 19 Jun 2016
1 in the past twelve months

ID Code: 96252
Item Type: Conference Paper
Refereed: Yes
Additional Information: Volume 9680 of the series Lecture Notes in Computer Science
Keywords: Frog call classification, Acoustic event detection, Multiple-instance multiple-label learning
DOI: 10.1007/978-3-319-33618-3_23
ISBN: 9783319336176
ISSN: 0302-9743
Divisions: Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright 2016 Springer International Publishing Switzerland
Copyright Statement: The final publication is available at Springer via
Deposited On: 19 Jun 2016 23:23
Last Modified: 24 Jun 2017 11:28
