Maximizing accuracy of forced alignment for spontaneous child speech

Robert Fromont; Lynn Clark; Joshua Wilson Black; Margaret Blackwood

doi:10.34842/shrr-sv10

Article

Authors

Robert Fromont, robert.fromont@canterbury.ac.nz, New Zealand Institute of Language, Brain and Behaviour, University of Canterbury
Lynn Clark, New Zealand Institute of Language, Brain and Behaviour, University of Canterbury
Joshua Wilson Black, New Zealand Institute of Language, Brain and Behaviour, University of Canterbury
Margaret Blackwood, New Zealand Institute of Language, Brain and Behaviour, University of Canterbury

Abstract

Sociophonetic study of large speech corpora generally requires the use of forced alignment (the automatic process of determining the start and end time of each speech sound within the recording) to facilitate large-scale automated extraction of acoustic measurements of targeted vowels or consonants. There is an extensive literature evaluating alignment accuracy of a number of forced alignment tools and procedures, processing speech data from a range of languages and dialects. In general, these evaluations use typical adult speech data, often elicited in a controlled laboratory environment. There is little literature on the effectiveness of forced alignment systems on child speech, and none on speech elicited in field environments. This presents a problem for research at the intersection of language acquisition and sociophonetics, as there is no established best practice for automatically aligning child speech. Child speech presents special challenges to automated tools, as it includes more variation in speech sounds and voice quality, and non-standard pronunciation and prosody. We evaluated two toolkits, Kaldi via the Montreal Forced Aligner (MFA), and the Hidden Markov Model Toolkit (HTK), using different configurations to force align non-rhotic child speech elicited in a preschool environment. Against many of our expectations, we found that MFA, using rhotic acoustic models pre-trained on adult speech, performed best. This paper provides a clear methodology for other researchers in sociophonetics to evaluate the success or otherwise of phonetic alignment.

Keywords:

child speech
language acquisition
sociophonetics
speech corpora
forced alignment

Published on
1 September 2023

Peer Reviewed

License

Creative Commons Attribution-Noncommercial 4.0 International

Issue

Volume 3 • Issue 1 • 2023

Identifiers

https://doi.org/10.34842/shrr-sv10

Publication details

Pages: 182-210
Article Number: 7
Accepted on: 4 August 2023

672 - Maximizing accuracy of forced alignment for spontaneous child speech

File Checksums (MD5)

2023_09_01.pdf: 922767e7b0501632ae4cc9dbdc74d644
