Image processing and classification procedure for the analysis of Australian frog vocalisations

Xie, Jie, Towsey, Michael, Zhang, Jinglan, & Roe, Paul (2015) Image processing and classification procedure for the analysis of Australian frog vocalisations. In Proceedings of the 2nd International Workshop on Environmental Multimedia Retrieval, ACM, Shanghai, China, pp. 15-20.

Abstract

Frog species have been declining worldwide at unprecedented rates in recent decades. There are many reasons for this decline, including pollution, habitat loss, and invasive species [1]. To preserve, protect, and restore frog biodiversity, it is important to monitor and assess frog species. This paper proposes a novel method that uses image processing techniques to analyse Australian frog vocalisations. A fast Fourier transform (FFT) is applied to the audio data to produce a spectrogram. Acoustic events are then detected and isolated into corresponding segments by applying image processing techniques to the spectrogram. For each segment, spectral peak tracks are extracted from selected seeds, and a region growing technique is used to obtain the contour of each frog vocalisation. Six feature sets are extracted from the spectral peak tracks and the contour of each vocalisation. Principal component analysis reduces each feature set to six principal components, which are tested for classification performance with a k-nearest neighbor classifier. The experiment evaluates the proposed method on fourteen frog species that are geographically widespread throughout Queensland, Australia. The experimental results show that the best average classification accuracy across the fourteen species reaches 87%.
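
The first and last stages of the pipeline described in the abstract (FFT spectrogram, then k-nearest neighbor classification) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The frame length, hop size, sample rate, synthetic sine-tone "calls", species labels, and every function name are assumptions made for illustration. The per-frame peak bin is only a crude stand-in for the paper's seed-based spectral peak tracks, and event detection, region growing, and the PCA step are omitted because the toy feature vector has just two dimensions.

```python
import cmath
import math
from collections import Counter


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one list of frame_len // 2 + 1 bins per frame."""
    # Hann window to reduce spectral leakage (an assumed choice).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append([abs(c) for c in fft(frame)[: frame_len // 2 + 1]])
    return frames


def peak_track_features(spec):
    """Toy features: mean and spread of the strongest bin in each frame."""
    peaks = [max(range(len(frame)), key=frame.__getitem__) for frame in spec]
    mean = sum(peaks) / len(peaks)
    spread = math.sqrt(sum((p - mean) ** 2 for p in peaks) / len(peaks))
    return (mean, spread)


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def tone(freq_hz, sample_rate=8000, n=512):
    """Synthetic sine tone standing in for a recorded call (hypothetical data)."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def features_of(freq_hz):
    return peak_track_features(spectrogram(tone(freq_hz)))


# Hypothetical two-class training set built from synthetic tones.
train = [(features_of(f), "species_a") for f in (500, 550, 600)]
train += [(features_of(f), "species_b") for f in (2000, 2100, 2200)]
print(knn_predict(train, features_of(580)))
```

On real recordings the feature vectors would have many more dimensions, which is where reducing each feature set to six principal components before the k-nearest neighbor step, as the abstract describes, would come in.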

Impact and interest:

2 citations in Scopus

ID Code: 89671
Item Type: Conference Paper
Refereed: Yes
Keywords: audio data classification, image processing, k-nearest neighbor classifier, region growing, spectral peak track
DOI: 10.1145/2764873.2764878
ISBN: 9781450335584
Subjects: Australian and New Zealand Standard Research Classification > ENVIRONMENTAL SCIENCES (050000) > ECOLOGICAL APPLICATIONS (050100) > Ecological Applications not elsewhere classified (050199)
Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Image Processing (080106)
Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > INFORMATION SYSTEMS (080600)
Divisions: Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright 2015 ACM
Deposited On: 29 Oct 2015 23:00
Last Modified: 04 Nov 2015 10:42
