TUTCRIS - Tampereen teknillinen yliopisto


Nonnegative Matrix Factorization



Book title: Audio Source Separation and Speech Enhancement
Editors: Emmanuel Vincent, Tuomas Virtanen, Sharon Gannot
ISBN (electronic): 978-1-119-27986-0
ISBN (print): 978-1-119-27989-1
Status: Published - 3 August 2018
OKM publication type: B2 Part of a book or other compiled work


Nonnegative matrix factorization (NMF) is a powerful model for representing speech and music data. In this chapter, we present its mathematical foundations and describe several probabilistic frameworks and algorithms for computing an NMF. We also describe advanced NMF models that represent audio signals more accurately by enforcing properties such as sparsity, harmonicity, and spectral smoothness, and by taking the non-stationarity of the data into account. We show that coupled factorizations make it possible to exploit additional information about the observed signal, such as the musical score. Finally, we present several methods for dictionary learning in NMF, and we conclude with the main benefits and drawbacks of NMF models.