TUTCRIS - Tampereen teknillinen yliopisto

Deep Neural Networks for Dynamic Range Compression in Mastering Applications

Research output

Standard

Deep Neural Networks for Dynamic Range Compression in Mastering Applications. / Mimilakis, Stylianos Ioannis; Drossos, Konstantinos; Virtanen, Tuomas; Schuller, Gerald.

Audio Engineering Society Convention 140. AES Audio Engineering Society, 2016.

Harvard

Mimilakis, SI, Drossos, K, Virtanen, T & Schuller, G 2016, Deep Neural Networks for Dynamic Range Compression in Mastering Applications. in Audio Engineering Society Convention 140. AES Audio Engineering Society.

APA

Mimilakis, S. I., Drossos, K., Virtanen, T., & Schuller, G. (2016). Deep Neural Networks for Dynamic Range Compression in Mastering Applications. In Audio Engineering Society Convention 140. AES Audio Engineering Society.

Vancouver

Mimilakis SI, Drossos K, Virtanen T, Schuller G. Deep Neural Networks for Dynamic Range Compression in Mastering Applications. In Audio Engineering Society Convention 140. AES Audio Engineering Society. 2016.

Author

Mimilakis, Stylianos Ioannis ; Drossos, Konstantinos ; Virtanen, Tuomas ; Schuller, Gerald. / Deep Neural Networks for Dynamic Range Compression in Mastering Applications. Audio Engineering Society Convention 140. AES Audio Engineering Society, 2016.

BibTeX

@inproceedings{ac44d7e1ebe840f68ba833aaa780e894,
title = "Deep Neural Networks for Dynamic Range Compression in Mastering Applications",
abstract = "The process of audio mastering often, if not always, includes various audio signal processing techniques such as frequency equalization and dynamic range compression. With respect to the genre and style of the audio content, the parameters of these techniques are controlled by a mastering engineer, in order to process the original audio material. This operation relies on musical and perceptually pleasing facets of the perceived acoustic characteristics, transmitted from the audio material under the mastering process. Modeling such dynamic operations, which involve adaptation regarding the audio content, becomes vital in automated applications since it significantly affects the overall performance. In this work we present a system capable of modelling such behavior focusing on the automatic dynamic range compression. It predicts frequency coefficients that allow the dynamic range compression, via a trained deep neural network, and applies them to unmastered audio signal served as input. Both dynamic range compression and the prediction of the corresponding frequency coefficients take place inside the time-frequency domain, using magnitude spectra acquired from a critical band filter bank, similar to humans’ peripheral auditory system. Results from conducted listening tests, incorporating professional music producers and audio mastering engineers, demonstrate on average an equivalent performance compared to professionally mastered audio content. Improvements were also observed when compared to relevant and commercial software.",
author = "Mimilakis, {Stylianos Ioannis} and Konstantinos Drossos and Tuomas Virtanen and Gerald Schuller",
year = "2016",
month = "5",
language = "English",
booktitle = "Audio Engineering Society Convention 140",
publisher = "AES Audio Engineering Society",

}

RIS (suitable for import to EndNote)

TY - GEN

T1 - Deep Neural Networks for Dynamic Range Compression in Mastering Applications

AU - Mimilakis, Stylianos Ioannis

AU - Drossos, Konstantinos

AU - Virtanen, Tuomas

AU - Schuller, Gerald

PY - 2016/5

Y1 - 2016/5

N2 - The process of audio mastering often, if not always, includes various audio signal processing techniques such as frequency equalization and dynamic range compression. With respect to the genre and style of the audio content, the parameters of these techniques are controlled by a mastering engineer, in order to process the original audio material. This operation relies on musical and perceptually pleasing facets of the perceived acoustic characteristics, transmitted from the audio material under the mastering process. Modeling such dynamic operations, which involve adaptation regarding the audio content, becomes vital in automated applications since it significantly affects the overall performance. In this work we present a system capable of modelling such behavior focusing on the automatic dynamic range compression. It predicts frequency coefficients that allow the dynamic range compression, via a trained deep neural network, and applies them to unmastered audio signal served as input. Both dynamic range compression and the prediction of the corresponding frequency coefficients take place inside the time-frequency domain, using magnitude spectra acquired from a critical band filter bank, similar to humans’ peripheral auditory system. Results from conducted listening tests, incorporating professional music producers and audio mastering engineers, demonstrate on average an equivalent performance compared to professionally mastered audio content. Improvements were also observed when compared to relevant and commercial software.

M3 - Conference contribution

BT - Audio Engineering Society Convention 140

PB - AES Audio Engineering Society

ER -