Generalised features for bird vocalisation retrieval in acoustic recordings

Dong, Xueyan, Xie, Jie, Towsey, Michael, Zhang, Jinglan, & Roe, Paul (2015) Generalised features for bird vocalisation retrieval in acoustic recordings. In 2015 IEEE 17th International Workshop on Multimedia Signal Processing (MMSP), Institute of Electrical and Electronics Engineers Inc. (IEEE), Xiamen, China, pp. 1-6.

Bioacoustic monitoring has become a significant research topic for species diversity conservation. Due to the development of sensing techniques, acoustic sensors are widely deployed in the field to record animal sounds over large spatial and temporal scales. With large volumes of collected audio data, it is essential to develop semi-automatic or automatic techniques to analyse the data. This can help ecologists decide how to protect and promote species diversity. This paper presents generic features to characterise a range of bird species for vocalisation retrieval. In the implementation, audio recordings are first converted to spectrograms using the short-time Fourier transform, then a ridge detection method is applied to the spectrogram to detect points of interest. Based on the detected points, a new region representation is explored for describing various bird vocalisations, and a local descriptor comprising temporal entropy, frequency bin entropy and a histogram of counts of four ridge directions is calculated for each sub-region. To speed up the retrieval process, indexing is carried out and the retrieved results are ranked according to similarity scores. The experimental results show that the proposed feature set achieves a retrieval success rate of 0.71, outperforming spectral ridge features alone (0.55) and Mel-frequency cepstral coefficients (0.36).
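
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the frame length, hop size, entropy normalisation and all function names are assumptions made for illustration. Ridge detection itself is not implemented here; the sketch assumes that a ridge detector has already labelled each spectrogram cell with one of four directions (0 to 3) or -1 for non-ridge cells.

```python
import numpy as np

def stft_spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram (frequency bins x frames) from a Hann-windowed STFT.
    frame_len and hop are illustrative values, not taken from the paper."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

def shannon_entropy(values):
    """Shannon entropy of a non-negative vector, normalised to [0, 1] (assumed normalisation)."""
    values = np.asarray(values, dtype=float)
    total = values.sum()
    if total <= 0 or values.size < 2:
        return 0.0
    p = values / total
    p = p[p > 0]  # skip empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum() / np.log2(values.size))

def region_descriptor(region, directions):
    """Descriptor for one sub-region: temporal entropy, frequency bin entropy,
    and counts of the four ridge directions.

    region     -- 2D array of spectrogram energy (frequency x time)
    directions -- integer array of the same shape; 0..3 for ridge cells, -1 otherwise
                  (hypothetical encoding, assumed output of a ridge detector)
    """
    temporal_entropy = shannon_entropy(region.sum(axis=0))   # energy spread across frames
    frequency_entropy = shannon_entropy(region.sum(axis=1))  # energy spread across bins
    ridge_cells = directions[directions >= 0]
    histogram = np.bincount(ridge_cells, minlength=4)[:4]
    return np.concatenate(([temporal_entropy, frequency_entropy], histogram))

def rank_by_similarity(query, candidates):
    """Return candidate indices ordered from most to least similar to the query,
    using Euclidean distance as a stand-in similarity score (an assumption)."""
    distances = np.linalg.norm(np.asarray(candidates) - np.asarray(query), axis=1)
    return np.argsort(distances)
```

In use, a recording would be passed to `stft_spectrogram`, split into sub-regions around detected ridge points, and each sub-region summarised with `region_descriptor`; `rank_by_similarity` then orders stored descriptors against a query descriptor.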

Impact and interest:

0 citations in Scopus

Full-text downloads:

40 since deposited on 13 Aug 2015
19 in the past twelve months

ID Code: 86528
Item Type: Conference Paper
Refereed: Yes
Additional Information: INSPEC Accession Number: 15648880
IEEE catalog number: CFP15MSP-AR
Keywords: bird vocalisation retrieval, spectral ridge feature, ridge detection, region representation, environmental audio
DOI: 10.1109/MMSP.2015.7340813
ISBN: 9781467374781
Divisions: Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright 2015 IEEE
Copyright Statement: Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Deposited On: 13 Aug 2015 00:46
Last Modified: 25 Jun 2017 09:02
