How Statistical Information from the Web can Help Identify Named Entities
Résumé
This paper presents a Natural Language Processing (NLP) approach to filter Named Entities (NE) from a list of collocation candidates. The NE are defined as the names of 'People', 'Places', 'Organizations', 'Software', 'Illnesses', and so forth. The proposed method is based on statistical measures associated with Web resources to identify NE. Our method has three stages: (1) Building artificial prepositional collocations from Noun-Noun candidates; (2) Measuring the "relevance" of the resulting prepositional collocations using statistical methods (Web Mining); (3) Selecting prepositional collocations. The evaluation of Noun-Noun collocations from French and English corpora confirmed the relevance of our system.
Origine | Fichiers produits par l'(les) auteur(s) |
---|
Loading...