Ristad and Yianilos 1997 Learning String Edit Distance
Citation
Ristad, E.S. and Yianilos, P.N. Learning string-edit distance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20: 522--532, 1998.
Online
- Paper web-site by Yianilos.
Summary
In this paper the authors describe a stochastic HMM for learning a string edit distance function, along with an efficient EM variant for learning the edit costs. They then present a stochastic solution to the problem of string classification, in which a string is classified by the similarity of the observed string to an underlying prototype from a class. The paper describes an efficient algorithm for inducing a joint probability model from a corpus; this model is then used to classify new strings. Finally, the described techniques are applied to the problem of learning word pronunciation in conversational speech.
Stochastic Model for String Edit Distance
The distance between two strings is modeled as a memoryless stochastic transduction built from edit operations: deletions, insertions, and substitutions. Using this model, two distance functions are defined: (1) the Viterbi edit distance, defined by the most likely transduction between the two strings, and (2) the stochastic edit distance...
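The two distances named above can be illustrated with a small dynamic program. The following is a minimal sketch, not the authors' implementation. It assumes that each distance is a negative log probability: the Viterbi distance takes the maximum over edit sequences, and the stochastic distance takes the sum over all edit sequences. The function name `edit_distances`, the dictionary `delta`, the termination probability `P_END`, and the toy probability values are all hypothetical choices made for illustration.

```python
import math

def edit_distances(x, y, delta, p_end):
    """Return (viterbi, stochastic) distances between strings x and y.

    delta: dict mapping (a, b) -> probability of one edit operation, where
           (a, "") is a deletion, ("", b) an insertion, (a, b) a substitution.
    p_end: probability of the termination event (an assumption of this sketch).
    """
    T, V = len(x), len(y)
    best = [[0.0] * (V + 1) for _ in range(T + 1)]   # max over edit sequences
    total = [[0.0] * (V + 1) for _ in range(T + 1)]  # sum over edit sequences
    best[0][0] = total[0][0] = 1.0

    for i in range(T + 1):
        for j in range(V + 1):
            if i == 0 and j == 0:
                continue
            terms_best, terms_total = [], []
            if i > 0:  # delete x[i-1]
                p = delta.get((x[i - 1], ""), 0.0)
                terms_best.append(p * best[i - 1][j])
                terms_total.append(p * total[i - 1][j])
            if j > 0:  # insert y[j-1]
                p = delta.get(("", y[j - 1]), 0.0)
                terms_best.append(p * best[i][j - 1])
                terms_total.append(p * total[i][j - 1])
            if i > 0 and j > 0:  # substitute x[i-1] -> y[j-1]
                p = delta.get((x[i - 1], y[j - 1]), 0.0)
                terms_best.append(p * best[i - 1][j - 1])
                terms_total.append(p * total[i - 1][j - 1])
            best[i][j] = max(terms_best)
            total[i][j] = sum(terms_total)

    def neg_log(p):
        return math.inf if p == 0.0 else -math.log(p)

    return neg_log(best[T][V] * p_end), neg_log(total[T][V] * p_end)

# Toy parameters over the alphabet {a, b}; the values are made up and
# chosen only so that all edit operations plus termination sum to 1.
ALPHABET = "ab"
delta = {}
for a in ALPHABET:
    delta[(a, "")] = 0.05          # deletion
    delta[("", a)] = 0.05          # insertion
    for b in ALPHABET:
        delta[(a, b)] = 0.30 if a == b else 0.05  # substitution
P_END = 0.10

d_viterbi, d_stochastic = edit_distances("a", "a", delta, P_END)
```

Because the sum over all edit sequences is never smaller than its largest term, the stochastic distance in this sketch is never larger than the Viterbi distance for the same pair of strings.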