An Approach to Mispronunciation Detection and Diagnosis with Acoustic, Phonetic and Linguistic (APL) Embeddings
2021
Abstract
Many mispronunciation detection and diagnosis (MD&D) approaches try to
exploit both acoustic and linguistic features as input. Yet the performance
improvement is limited, partly due to the shortage of large amounts of
annotated training data at the phoneme level. Phonetic embeddings, extracted
from ASR models trained with huge amounts of word-level annotations, can serve
as a good representation of the content of input speech in a noise-robust and
speaker-independent manner. These embeddings, when used as implicit phonetic
supplementary information, can alleviate the shortage of explicit phoneme
annotations. We propose to use Acoustic, Phonetic and Linguistic (APL)
embedding features jointly to build a more powerful MD&D system. Experimental
results obtained on the L2-ARCTIC database show that the proposed approach
outperforms the baseline by 9.93%, 10.13% and 6.17% on detection accuracy,
diagnosis error rate and F-measure, respectively.
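The abstract names three evaluation measures without defining them. As a rough illustration, the sketch below computes them under the hierarchical convention commonly used in MD&D evaluation, where each phone is counted as a true acceptance, false rejection, false acceptance, or true rejection, and true rejections are split into correct diagnoses and diagnosis errors. These definitions are an assumption (the paper may differ in detail), and the class name, function name and counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MddCounts:
    """Phone-level counts (hypothetical container, not from the paper)."""
    true_accept: int    # correctly pronounced, accepted as correct
    false_reject: int   # correctly pronounced, flagged as mispronounced
    false_accept: int   # mispronounced, accepted as correct
    correct_diag: int   # mispronounced, flagged, and the right phone diagnosed
    diag_error: int     # mispronounced, flagged, but the wrong phone diagnosed


def _ratio(num: float, den: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return num / den if den else 0.0


def mdd_metrics(c: MddCounts) -> dict:
    """Assumed definitions of detection accuracy, F-measure and
    diagnosis error rate; mispronunciation is the positive class."""
    true_reject = c.correct_diag + c.diag_error
    total = c.true_accept + c.false_reject + c.false_accept + true_reject
    precision = _ratio(true_reject, true_reject + c.false_reject)
    recall = _ratio(true_reject, true_reject + c.false_accept)
    return {
        "detection_accuracy": _ratio(c.true_accept + true_reject, total),
        "precision": precision,
        "recall": recall,
        "f_measure": _ratio(2 * precision * recall, precision + recall),
        "diagnosis_error_rate": _ratio(c.diag_error, true_reject),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only, not results from the paper.
    example = MddCounts(true_accept=80, false_reject=5, false_accept=5,
                        correct_diag=8, diag_error=2)
    for name, value in mdd_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Note that a lower diagnosis error rate is better, while higher detection accuracy and F-measure are better, so the three reported improvements do not all point in the same numerical direction.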
| Reference Key | wu2021an |
|---|---|
| Authors | Wenxuan Ye; Shaoguang Mao; Frank Soong; Wenshan Wu; Yan Xia; Jonathan Tien; Zhiyong Wu |
| Journal | arXiv |
| Year | 2021 |
| DOI | DOI not found |
| URL | |
| Keywords | |