PP Attachment Ambiguity Resolution with Corpus-Based Pattern Distributions and Lexical Signatures
Abstract
In this paper, we propose a method that combines unsupervised learning of lexical frequencies with semantic information to improve PP attachment ambiguity resolution. Starting from the output of a robust parser, i.e. the set of all possible attachments for a given sentence, we query the Web to obtain statistical information about the frequency distributions of the attachments, as well as lexical signatures of the terms in the patterns. All of this information is then used to weight the dependencies yielded by the parser.
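As a rough illustration of the weighting step only, the sketch below turns pattern counts for competing attachments into normalized weights. It is a minimal sketch under stated assumptions, not the method of the paper: the function name `weight_attachments`, the triple representation of an attachment, the add-one smoothing, and the toy counts are all hypothetical, and the Web querying and lexical signatures mentioned in the abstract are left out.

```python
from typing import Dict, List, Tuple

# Hypothetical representation of a candidate attachment:
# (head word, preposition, object of the preposition).
Attachment = Tuple[str, str, str]


def weight_attachments(
    candidates: List[Attachment],
    pattern_counts: Dict[Attachment, int],
    smoothing: float = 1.0,
) -> Dict[Attachment, float]:
    """Turn raw pattern counts into weights that sum to 1 over the candidates.

    Additive smoothing (an assumption of this sketch) keeps unseen
    patterns from receiving a weight of exactly zero.
    """
    smoothed = {c: pattern_counts.get(c, 0) + smoothing for c in candidates}
    total = sum(smoothed.values())
    return {c: value / total for c, value in smoothed.items()}


# Two competing attachments for "eat pizza with fork":
# the PP attaches either to the verb or to the noun.
candidates = [("eat", "with", "fork"), ("pizza", "with", "fork")]

# Toy counts standing in for frequencies gathered from a corpus or the Web.
toy_counts = {("eat", "with", "fork"): 30, ("pizza", "with", "fork"): 8}

weights = weight_attachments(candidates, toy_counts)
best = max(weights, key=weights.get)
```

With these invented counts the verb attachment receives the larger weight. In a fuller system, such weights would presumably be combined with the semantic information before ranking the parser's dependencies.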
Domains
Document and Text Processing