Towards Improved Assessment of Phonotactic Information for Automatic Language Identification
Martin, Terry, Wong, Eddie, & Sridharan, Sridha (2006) Towards Improved Assessment of Phonotactic Information for Automatic Language Identification. In Berkling, K. & Torres-Carrasquillo, P. (Eds.) 2006 IEEE Odyssey - The Speaker and Language Recognition Workshop, 28-30 June, San Juan, Puerto Rico.
Phonotactic modelling, typically in the form of a Parallel Phone Recognition followed by Language Modelling (PPRLM) system, forms a key component of state-of-the-art Language Identification (LID) systems. Given that the objective of PPRLM systems is to capture as accurately as possible the phonotactics which characterise a language, it is assumed that minimising Phone Error Rate (PER) is a precursor to achieving this effectively. In this paper we examine the relevance of PER as a metric for determining eventual LID performance. To conduct this investigation we make use of the CallHome corpus, on the premise that it better represents the style of discourse and channel conditions encountered in Conversational Telephone Speech (CTS), which is now the focus of current NIST LID evaluations. Using CallHome instead of the OGI-MLTS corpus to train phone recognisers, we obtained significantly improved results, with an average improvement of approximately 6% absolute across the 30-, 10- and 3-second tasks for the NIST 1996 and 2003 evaluations. We also examine the impact of tuning the individual front-end recognisers on both the resultant PER of other languages and the resultant LID performance. We find that PER has a number of limitations in indicating both the degree and direction of changes to LID performance. Accordingly, we propose a new metric which is better suited to forecasting the impact on LID performance when the phone recogniser front-end is modified.
|Item Type:||Conference Paper|
|Divisions:||Past > QUT Faculties & Divisions > Faculty of Built Environment and Engineering|
|Copyright Owner:||Copyright 2006 (please consult authors)|
Paper available at IEEE site.
Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
|Deposited On:||16 Oct 2007 00:00|
|Last Modified:||29 Feb 2012 13:26|