TUTCRIS - Tampere University of Technology

Low-Latency Sound-Source-Separation using Non-Negative Matrix Factorisation with Coupled Analysis and Synthesis Dictionaries

Research output: peer-reviewed

Details

Original language: English
Title: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publisher: IEEE
Pages: 241-245
Number of pages: 5
ISBN (print): 9781467369978
DOI (permanent links):
Status: Published - 4 August 2015
OKM publication type: A4 Article in conference proceedings
Event: IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING -
Duration: 1 January 1900 to 1 January 2000

Conference

Conference: IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING
Period: 1/01/00 to 1/01/00

Abstract

For real-time or close to real-time applications, sound source separation can be performed on-line, where new frames of incoming data for a mixture signal are processed as they arrive, at very low delay. We propose an approach which generates the separation filters for short synthesis frames to achieve low-latency source separation, based on a compositional model of the audio mixture to be separated. Filter parameters are derived from a longer temporal context than the current processing frame through the use of a longer analysis frame. A pair of dictionaries is used, one for analysis and one for reconstruction. With this approach we are able to increase separation performance whilst retaining the low latency provided by the use of short synthesis frames. The proposed data handling scheme and parameters can be adjusted to achieve real-time performance, given sufficient computational power. Low-latency output allows a human listener to use the results of such a separation scheme directly, without a perceptible delay. With the proposed method, the source-to-distortion ratios (SDRs) of the separated signals can be improved by over 1 dB for latencies below 20 ms, without any effect on latency.
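The coupled-dictionary idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes magnitude spectra as input, assumes that column k of the analysis dictionary and column k of the synthesis dictionary describe the same atom, and uses standard multiplicative updates for the Kullback-Leibler divergence with a fixed dictionary. The names `estimate_activations`, `separation_masks` and `atom_source`, the frame sizes and the random toy data are all hypothetical.

```python
import numpy as np


def estimate_activations(x, W, n_iter=100, eps=1e-12):
    """Non-negative activations h with x ~= W @ h, dictionary W held fixed.

    Uses the standard multiplicative update for the KL divergence.
    """
    h = np.ones(W.shape[1])
    col_sums = W.sum(axis=0) + eps
    for _ in range(n_iter):
        h *= (W.T @ (x / (W @ h + eps))) / col_sums
    return h


def separation_masks(x_long, W_analysis, W_synthesis, atom_source,
                     n_sources, eps=1e-12):
    """Soft masks for the short synthesis frame, one row per source.

    Activations are estimated on the long analysis frame, then reused
    with the coupled synthesis dictionary (assumed atom-by-atom pairing).
    """
    h = estimate_activations(x_long, W_analysis)
    models = np.stack([
        W_synthesis[:, atom_source == s] @ h[atom_source == s]
        for s in range(n_sources)
    ])
    # Wiener-like ratio: each source's share of the modelled mixture.
    return models / (models.sum(axis=0, keepdims=True) + eps)


# Toy data (hypothetical sizes): 129 bins for the long analysis frame,
# 33 bins for the short synthesis frame, 4 atoms per source.
rng = np.random.default_rng(0)
n_atoms, n_sources = 8, 2
atom_source = np.repeat(np.arange(n_sources), n_atoms // n_sources)
W_analysis = rng.random((129, n_atoms)) + 0.1
W_synthesis = rng.random((33, n_atoms)) + 0.1

x_long = W_analysis @ rng.random(n_atoms)   # long-frame mixture spectrum
x_short = rng.random(33)                    # short-frame mixture spectrum

masks = separation_masks(x_long, W_analysis, W_synthesis,
                         atom_source, n_sources)
separated = masks * x_short                 # per-source short-frame spectra
```

In this sketch the output delay is tied to the short synthesis frame only, while the activations benefit from the longer analysis frame, which is the trade-off the abstract describes. In practice both dictionaries would be learned from training data rather than drawn at random.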