AbstractIn this paper we describe an approach that both creates crosslingual acoustic monophone model sets for speech recognition tasks and objectively predicts their performance without target-language speech data or acoustic measurement techniques. This strategy is based on a series of linguistic metrics characterizing the articulatory phonetic and phonological distances of target-language phonemes from source-language phonemes. We term these algorithms the Combined Phonetic and Phonological Crosslingual Distance (CPP-CD) metric and the Combined Phonetic and Phonological Crosslingual Prediction (CPP-CP) metric. The particular motivations for this project are the current unavailability and often prohibitively high production cost of speech databases for many strategically important low- and middle-density languages. First, we describe the CPP-CD approach and compare the performance of CPP-CD-specified models to both native language models and crosslingual models selected by the Bhattacharyya acoustic-model distance metric in automatic speech recognition (ASR) experiments. Results confirm that the CPP-CD approach nearly matches those achieved by the acoustic distance metric. We then test the CPP-CP algorithm on the CPP-CD models by comparing the CPP-CP scores to the recognition phoneme error rates. Based on this comparison, we conclude that the CPP-CP algorithm is a reliable indicator of crosslingual model performance in speech recognition tasks.