Tampere University of Technology

TUTCRIS Research Portal

Data augmentation approaches for improving animal audio classification

Research output: Contribution to journal › Article › Scientific › peer-review

Details

Original language: English
Article number: 101084
Number of pages: 8
Journal: Ecological Informatics
Volume: 57
DOIs
Publication status: Published - 2020
Publication type: A1 Journal article-refereed

Abstract

In this paper we present ensembles of classifiers for automated animal audio classification, exploiting different data augmentation techniques for training Convolutional Neural Networks (CNNs). The specific classification problems are i) bird sounds and ii) cat sounds, for which the datasets are freely available. We train five different CNNs on the original datasets and on versions augmented by four augmentation protocols, which operate either on the raw audio signals or on their spectrogram representations. We compare our best approaches with the state of the art and show that they obtain the best recognition rate on the same datasets without ad hoc parameter optimization. Our study shows that different CNNs can be trained for animal audio classification and that their fusion works better than the stand-alone classifiers. To the best of our knowledge, this is the largest study on data augmentation for CNNs on animal audio classification datasets using the same set of classifiers and parameters. Our MATLAB code is available at https://github.com/LorisNanni.
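
The two ideas in the abstract, augmenting raw audio before training and fusing the outputs of several classifiers, can be illustrated with a minimal Python sketch. This is not the authors' MATLAB code. The abstract does not name the four augmentation protocols or the fusion rule, so the noise injection, the circular time shift, and the score-averaging (sum rule) fusion below are assumptions chosen only for illustration, and the function names `add_noise`, `time_shift`, and `sum_rule_fusion` are hypothetical.

```python
import numpy as np


def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise at a target signal-to-noise ratio in dB.

    Assumed augmentation for illustration; not taken from the paper.
    """
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def time_shift(signal, shift):
    """Circularly shift the waveform by `shift` samples (assumed augmentation)."""
    return np.roll(signal, shift)


def sum_rule_fusion(score_matrices):
    """Average per-class scores across classifiers and pick the best class.

    Each matrix has shape (n_samples, n_classes). The averaging rule is an
    assumption; the abstract only states that fusion beats single networks.
    """
    fused = np.mean(np.stack(score_matrices, axis=0), axis=0)
    return fused, np.argmax(fused, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # A synthetic one-second 440 Hz tone stands in for a real recording.
    t = np.linspace(0.0, 1.0, 16000, endpoint=False)
    clip = np.sin(2 * np.pi * 440.0 * t)

    # Each augmented copy keeps the label of the original clip.
    augmented = [add_noise(clip, 20.0, rng), time_shift(clip, 800)]
    print(len(augmented), augmented[0].shape)

    # Made-up class scores from two hypothetical classifiers.
    scores_a = np.array([[0.6, 0.4], [0.3, 0.7]])
    scores_b = np.array([[0.2, 0.8], [0.1, 0.9]])
    fused, labels = sum_rule_fusion([scores_a, scores_b])
    print(labels)
```

In this toy example the first classifier alone would assign the first sample to class 0, while the averaged scores assign it to class 1, which shows how fusion can override a single network's decision.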

Keywords

  • Acoustic features, Animal audio, Audio classification, Data augmentation, Ensemble of classifiers, Pattern recognition
