TUTCRIS - Tampere University of Technology

Detection of Typical Pronunciation Errors in Non-native English Speech Using Convolutional Recurrent Neural Networks

Research output, peer-reviewed

Standard

Detection of Typical Pronunciation Errors in Non-native English Speech Using Convolutional Recurrent Neural Networks. / Diment, Aleksandr; Fagerlund, Eemi; Benfield, Adrian; Virtanen, Tuomas.

2019 International Joint Conference on Neural Networks, IJCNN 2019. IEEE, 2019.

Harvard

Diment, A, Fagerlund, E, Benfield, A & Virtanen, T 2019, Detection of Typical Pronunciation Errors in Non-native English Speech Using Convolutional Recurrent Neural Networks. in 2019 International Joint Conference on Neural Networks, IJCNN 2019. IEEE, Budapest, Hungary, 14/07/19. https://doi.org/10.1109/IJCNN.2019.8851963

APA

Diment, A., Fagerlund, E., Benfield, A., & Virtanen, T. (2019). Detection of Typical Pronunciation Errors in Non-native English Speech Using Convolutional Recurrent Neural Networks. In 2019 International Joint Conference on Neural Networks, IJCNN 2019. IEEE. https://doi.org/10.1109/IJCNN.2019.8851963

Vancouver

Diment A, Fagerlund E, Benfield A, Virtanen T. Detection of Typical Pronunciation Errors in Non-native English Speech Using Convolutional Recurrent Neural Networks. In 2019 International Joint Conference on Neural Networks, IJCNN 2019. IEEE. 2019. https://doi.org/10.1109/IJCNN.2019.8851963

Author

Diment, Aleksandr ; Fagerlund, Eemi ; Benfield, Adrian ; Virtanen, Tuomas. / Detection of Typical Pronunciation Errors in Non-native English Speech Using Convolutional Recurrent Neural Networks. 2019 International Joint Conference on Neural Networks, IJCNN 2019. IEEE, 2019.

BibTeX - Download

@inproceedings{37cd86b75b84459cbfe9bf18c0fc8281,
title = "Detection of Typical Pronunciation Errors in Non-native English Speech Using Convolutional Recurrent Neural Networks",
abstract = "A machine learning method for the automatic detection of pronunciation errors made by non-native speakers of English is proposed. It consists of training word-specific binary classifiers on a collected dataset of isolated words with possible pronunciation errors, typical for Finnish native speakers. The classifiers predict whether the typical error is present in the given word utterance. They operate on sequences of acoustic features, extracted from consecutive frames of an audio recording of a word utterance. The proposed architecture includes a convolutional neural network, a recurrent neural network, or a combination of the two. The optimal topology and hyperparameters are obtained in a Bayesian optimisation setting using a tree-structured Parzen estimator. A dataset of 80 words uttered naturally by 120 speakers is collected. The performance of the proposed system, evaluated on a well-represented subset of the dataset, shows that it is capable of detecting pronunciation errors in most of the words (46/49) with high accuracy (mean accuracy gain over the zero rule 12.21 percentage points).",
keywords = "Computer-assisted language learning, computer-assisted pronunciation training, CNN, CRNN, GRU, pronunciation learning",
author = "Aleksandr Diment and Eemi Fagerlund and Adrian Benfield and Tuomas Virtanen",
note = "jufoid=58177",
year = "2019",
month = "7",
day = "1",
doi = "10.1109/IJCNN.2019.8851963",
language = "English",
publisher = "IEEE",
booktitle = "2019 International Joint Conference on Neural Networks, IJCNN 2019",

}

RIS (suitable for import to EndNote) - Download

TY - GEN

T1 - Detection of Typical Pronunciation Errors in Non-native English Speech Using Convolutional Recurrent Neural Networks

AU - Diment, Aleksandr

AU - Fagerlund, Eemi

AU - Benfield, Adrian

AU - Virtanen, Tuomas

N1 - jufoid=58177

PY - 2019/7/1

Y1 - 2019/7/1

N2 - A machine learning method for the automatic detection of pronunciation errors made by non-native speakers of English is proposed. It consists of training word-specific binary classifiers on a collected dataset of isolated words with possible pronunciation errors, typical for Finnish native speakers. The classifiers predict whether the typical error is present in the given word utterance. They operate on sequences of acoustic features, extracted from consecutive frames of an audio recording of a word utterance. The proposed architecture includes a convolutional neural network, a recurrent neural network, or a combination of the two. The optimal topology and hyperparameters are obtained in a Bayesian optimisation setting using a tree-structured Parzen estimator. A dataset of 80 words uttered naturally by 120 speakers is collected. The performance of the proposed system, evaluated on a well-represented subset of the dataset, shows that it is capable of detecting pronunciation errors in most of the words (46/49) with high accuracy (mean accuracy gain over the zero rule 12.21 percentage points).

AB - A machine learning method for the automatic detection of pronunciation errors made by non-native speakers of English is proposed. It consists of training word-specific binary classifiers on a collected dataset of isolated words with possible pronunciation errors, typical for Finnish native speakers. The classifiers predict whether the typical error is present in the given word utterance. They operate on sequences of acoustic features, extracted from consecutive frames of an audio recording of a word utterance. The proposed architecture includes a convolutional neural network, a recurrent neural network, or a combination of the two. The optimal topology and hyperparameters are obtained in a Bayesian optimisation setting using a tree-structured Parzen estimator. A dataset of 80 words uttered naturally by 120 speakers is collected. The performance of the proposed system, evaluated on a well-represented subset of the dataset, shows that it is capable of detecting pronunciation errors in most of the words (46/49) with high accuracy (mean accuracy gain over the zero rule 12.21 percentage points).

KW - Computer-assisted language learning

KW - computer-assisted pronunciation training

KW - CNN

KW - CRNN

KW - GRU

KW - pronunciation learning

U2 - 10.1109/IJCNN.2019.8851963

DO - 10.1109/IJCNN.2019.8851963

M3 - Conference contribution

BT - 2019 International Joint Conference on Neural Networks, IJCNN 2019

PB - IEEE

ER -
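
Note on the reported metric: the abstract measures performance as an accuracy gain over the zero rule, given in percentage points. The sketch below shows one common reading of that metric, assuming the zero rule means always predicting the most frequent class of a word's evaluation set. The function names and the toy labels are hypothetical and are not taken from the paper, whose exact evaluation protocol is not described in this record.

```python
from collections import Counter


def zero_rule_accuracy(labels):
    """Accuracy of a baseline that always predicts the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def accuracy(labels, predictions):
    """Fraction of predictions that match the reference labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def gain_over_zero_rule(labels, predictions):
    """Classifier accuracy minus zero-rule accuracy, in percentage points."""
    return 100.0 * (accuracy(labels, predictions) - zero_rule_accuracy(labels))


# Hypothetical toy data for one word: 1 = typical error present, 0 = absent.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
predictions = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

print(round(gain_over_zero_rule(labels, predictions), 2))
```

A positive gain means the classifier beats the majority-class baseline for that word. Under this reading, the mean figure quoted in the abstract would be an average of such per-word gains.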