Tampere University of Technology

TUTCRIS Research Portal

Speech Detection on Broadcast Audio

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Scientific > peer-review

Standard

Speech Detection on Broadcast Audio. / Zubari, Unal; Ozan, Ezgi Can; Acar, Banu Oskay; Ciloglu, Tolga; Esen, Ersin; Ates, Tugrul K.; Onur, Duygu Oskay.

18TH European Signal Processing Conference (EUSIPCO-2010). ed. / B Kleijn; J Larsen. KESSARIANI : EUROPEAN ASSOC SIGNAL SPEECH & IMAGE PROCESSING-EURASIP, 2010. p. 85-89 (European Signal Processing Conference; Vol. 18).

Harvard

Zubari, U, Ozan, EC, Acar, BO, Ciloglu, T, Esen, E, Ates, TK & Onur, DO 2010, Speech Detection on Broadcast Audio. in B Kleijn & J Larsen (eds), 18TH European Signal Processing Conference (EUSIPCO-2010). European Signal Processing Conference, vol. 18, EUROPEAN ASSOC SIGNAL SPEECH & IMAGE PROCESSING-EURASIP, KESSARIANI, pp. 85-89, 18th European Signal Processing Conference (EUSIPCO), Denmark, 23/08/10.

APA

Zubari, U., Ozan, E. C., Acar, B. O., Ciloglu, T., Esen, E., Ates, T. K., & Onur, D. O. (2010). Speech Detection on Broadcast Audio. In B. Kleijn, & J. Larsen (Eds.), 18TH European Signal Processing Conference (EUSIPCO-2010) (pp. 85-89). (European Signal Processing Conference; Vol. 18). KESSARIANI: EUROPEAN ASSOC SIGNAL SPEECH & IMAGE PROCESSING-EURASIP.

Vancouver

Zubari U, Ozan EC, Acar BO, Ciloglu T, Esen E, Ates TK et al. Speech Detection on Broadcast Audio. In Kleijn B, Larsen J, editors, 18TH European Signal Processing Conference (EUSIPCO-2010). KESSARIANI: EUROPEAN ASSOC SIGNAL SPEECH & IMAGE PROCESSING-EURASIP. 2010. p. 85-89. (European Signal Processing Conference).

Author

Zubari, Unal ; Ozan, Ezgi Can ; Acar, Banu Oskay ; Ciloglu, Tolga ; Esen, Ersin ; Ates, Tugrul K. ; Onur, Duygu Oskay. / Speech Detection on Broadcast Audio. 18TH European Signal Processing Conference (EUSIPCO-2010). editor / B Kleijn ; J Larsen. KESSARIANI : EUROPEAN ASSOC SIGNAL SPEECH & IMAGE PROCESSING-EURASIP, 2010. pp. 85-89 (European Signal Processing Conference).

BibTeX

@inproceedings{82a68bd9e70a447b9782129b94e9abab,
title = "Speech Detection on Broadcast Audio",
abstract = "Speech boundary detection contributes to the performance of speech-based applications such as speech recognition and speaker recognition. The speech boundary detector implemented in this study works on broadcast audio as a pre-processor module of a keyword spotter. Speech boundary detection is handled in three steps. In the first step, audio data is segmented into homogeneous regions in an unsupervised manner. After an ACTIVITY/NON-ACTIVITY decision is made for each region, ACTIVITY regions are classified as Speech/Non-speech via Gaussian Mixture Model (GMM) based classification. GMMs are trained using a novel feature, Spectral Flow Direction (SFD), and an improved multi-band harmonicity feature, in addition to the widely used Mel Frequency Cepstral Coefficients (MFCCs).",
keywords = "CLASSIFICATION, RETRIEVAL, MUSIC",
author = "Unal Zubari and Ozan, {Ezgi Can} and Acar, {Banu Oskay} and Tolga Ciloglu and Ersin Esen and Ates, {Tugrul K.} and Onur, {Duygu Oskay}",
year = "2010",
language = "English",
series = "European Signal Processing Conference",
publisher = "EUROPEAN ASSOC SIGNAL SPEECH \& IMAGE PROCESSING-EURASIP",
pages = "85--89",
editor = "B Kleijn and J Larsen",
booktitle = "18TH European Signal Processing Conference (EUSIPCO-2010)",

}

RIS (suitable for import to EndNote)

TY - GEN

T1 - Speech Detection on Broadcast Audio

AU - Zubari, Unal

AU - Ozan, Ezgi Can

AU - Acar, Banu Oskay

AU - Ciloglu, Tolga

AU - Esen, Ersin

AU - Ates, Tugrul K.

AU - Onur, Duygu Oskay

PY - 2010

Y1 - 2010

N2 - Speech boundary detection contributes to the performance of speech-based applications such as speech recognition and speaker recognition. The speech boundary detector implemented in this study works on broadcast audio as a pre-processor module of a keyword spotter. Speech boundary detection is handled in three steps. In the first step, audio data is segmented into homogeneous regions in an unsupervised manner. After an ACTIVITY/NON-ACTIVITY decision is made for each region, ACTIVITY regions are classified as Speech/Non-speech via Gaussian Mixture Model (GMM) based classification. GMMs are trained using a novel feature, Spectral Flow Direction (SFD), and an improved multi-band harmonicity feature, in addition to the widely used Mel Frequency Cepstral Coefficients (MFCCs).

AB - Speech boundary detection contributes to the performance of speech-based applications such as speech recognition and speaker recognition. The speech boundary detector implemented in this study works on broadcast audio as a pre-processor module of a keyword spotter. Speech boundary detection is handled in three steps. In the first step, audio data is segmented into homogeneous regions in an unsupervised manner. After an ACTIVITY/NON-ACTIVITY decision is made for each region, ACTIVITY regions are classified as Speech/Non-speech via Gaussian Mixture Model (GMM) based classification. GMMs are trained using a novel feature, Spectral Flow Direction (SFD), and an improved multi-band harmonicity feature, in addition to the widely used Mel Frequency Cepstral Coefficients (MFCCs).

KW - CLASSIFICATION

KW - RETRIEVAL

KW - MUSIC

M3 - Conference contribution

T3 - European Signal Processing Conference

SP - 85

EP - 89

BT - 18TH European Signal Processing Conference (EUSIPCO-2010)

A2 - Kleijn, B

A2 - Larsen, J

PB - EUROPEAN ASSOC SIGNAL SPEECH & IMAGE PROCESSING-EURASIP

CY - KESSARIANI

ER -