ACTA Scientiarum Naturalium Universitatis Pekinensis
A Study of Articulatory Features Based Detection of Mandarin Pronunciation Erroneous Tendency for Automatic Annotation
WEI Xing, WANG Wei, CHEN Jingping, XIE Yanlu†, ZHANG Jinsong
Advanced Innovation Center for Language Resource and Intelligence Research Funds of State Language Commission, School of Information Science, Beijing Language and Culture University, Beijing 100083; † Corresponding author, E-mail: xieyanlu@blcu.edu.cn
Abstract To reduce the time cost and inconsistency of annotation, the authors use an articulatory-feature-based mispronunciation detection system to provide Top-n feedback and use this feedback to assist manual annotation. With the proposed system, the consistency rate of phoneme labels increases from 80.7% to 92.48%, and the time needed to annotate each sentence falls from 10 to 3 minutes. The results indicate that the proposed automatic annotation system is practical and that there is still room for further improvement.
Key words articulatory features (AFs); pronunciation erroneous tendency (PET); automatic annotation
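In recent years, with the development of machine learning and computer technology, automatic speech recognition (ASR) has become one of the current research hotspots. Annotated corpora play an increasingly important role in speech synthesis, speech recognition, speech analysis, and related fields. Annotating a large-scale speech corpus requires substantial human resources. Long periods of continuous work inevitably cause fatigue and burnout in annotators. In addition, the annotators' level of professional phonetic training, their grasp of phonetic knowledge, and physiological and psychological factors together introduce subjective errors that affect the annotation results [1]. It is therefore necessary to develop automatic speech annotation systems.
Speech corpora are generally annotated automatically, manually, or by a combination of the two, for example by first annotating the speech data automatically with an ASR system and then correcting the result manually [2]. Zhu Weibin et al. [1] hold that there are two technical routes for automatic speech annotation systems: 1) the statistical-model-based route, which relies on a sufficiently large corpus with manual annotation information; 2) the linguistic-model-based route, which starts from prior rules summarized from linguistic and acoustic knowledge.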
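Because automatic annotation is less accurate than manual annotation, existing ASR systems cannot annotate a speech corpus fully automatically, so annotation is usually completed by combining automatic and manual annotation. For an unannotated corpus, phoneme-level information is generally annotated automatically first, and professional annotators then proofread and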