Analyzing big environmental audio with frequency preserving autoencoders
Submitted Version (PDF 594kB)
Description
Continuous audio recordings play an increasingly important role in conservation and biodiversity monitoring; however, listening to these recordings is often infeasible, as they can be thousands of hours long. Automating their analysis using machine learning is in high demand, but such algorithms require a feature representation. Several methods for generating feature representations for these data have been developed, using techniques such as domain-specific features and deep learning. However, domain-specific features are unlikely to be an ideal representation of the data, and deep learning methods often require extensively labeled data.

In this paper, we propose a method for generating a frequency-preserving autoencoder-based feature representation for unlabeled ecological audio. We evaluate multiple frequency-preserving autoencoder-based feature representations on a hierarchical clustering sample task, and compare them to a basic autoencoder feature representation, MFCCs, and spectral acoustic indices. Experimental results show that some of these non-square autoencoder architectures compare well to the existing feature representations.

This novel method for generating a feature representation for unlabeled ecological audio offers ecologists a fast, general way to generate a feature representation of their audio that does not require extensively labeled data.
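The abstract does not specify the exact encoder layers, but the core idea of a "frequency-preserving" (non-square) architecture can be illustrated with a minimal shape sketch: downsampling only along the time axis of a spectrogram keeps every frequency bin available at the bottleneck, whereas square downsampling collapses frequency resolution as well. The pooling function and the spectrogram dimensions below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pool2d(x, pool_f, pool_t):
    """Non-overlapping mean pooling over a (freq, time) array
    with a window of (pool_f, pool_t)."""
    f, t = x.shape
    f2, t2 = f // pool_f, t // pool_t
    return (x[:f2 * pool_f, :t2 * pool_t]
            .reshape(f2, pool_f, t2, pool_t)
            .mean(axis=(1, 3)))

# Toy mel spectrogram: 128 frequency bins x 256 time frames (assumed sizes).
spec = np.random.rand(128, 256)

# Square (2x2) pooling twice: frequency resolution is quartered.
square = pool2d(pool2d(spec, 2, 2), 2, 2)

# Frequency-preserving (1x2) pooling twice: only time is compressed.
freq_preserving = pool2d(pool2d(spec, 1, 2), 1, 2)

print(square.shape)           # (32, 64)  -- frequency detail lost
print(freq_preserving.shape)  # (128, 64) -- all 128 frequency bins kept
```

In an actual convolutional autoencoder the same effect would come from non-square kernels and strides (e.g. stride `(1, 2)` instead of `(2, 2)`), which is what "non-square architectures" refers to in the abstract.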
ID Code: 228402
Item Type: Chapter in Book, Report or Conference volume (Conference contribution)
Series Name: Proceedings - IEEE 17th International Conference on eScience, eScience 2021
Measurements or Duration: 10 pages
Keywords: Autoencoder, Deep Learning, Ecoacoustics, Machine Learning
DOI: 10.1109/eScience51609.2021.00017
ISBN: 978-1-6654-4708-9
Pure ID: 105816771
Divisions:
- Current > Research Centres > Centre for Data Science
- Current > Research Centres > Centre for the Environment
- Current > QUT Faculties and Divisions > Faculty of Science
- Current > Schools > School of Computer Science
Copyright Owner: 2021 IEEE
Copyright Statement: This work is covered by copyright. Unless the document is being made available under a Creative Commons Licence, you must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the document is available under a Creative Commons License (or other specified license), then refer to the Licence for details of permitted re-use. It is a condition of access that users recognise and abide by the legal requirements associated with these rights. If you believe that this work infringes copyright, please provide details by email to qut.copyright@qut.edu.au
Deposited On: 17 Feb 2022 02:41
Last Modified: 01 Mar 2024 00:17