Identifying Polysemous Words and Inferring Sense Glosses in a Semantic Network

Maxime Chapuis; Mathieu Lafourcade

Conference paper, Year: 2017

Maxime Chapuis

Role: Author
PersonId : 1149467
ORCID : 0000-0003-0876-8711
IdRef : 268649286

CERPHI

Institut d’Histoire des Représentations et des Idées dans les Modernités

Mathieu Lafourcade

Role: Author
PersonId : 172381
IdHAL : mathieu-lafourcade
ORCID : 0000-0003-2832-2143

Exploration et exploitation de données textuelles

Abstract

Introduction
This paper aims to detect polysemous words from their hypernyms. For instance, a native speaker who knows that the French word frégate (frigate) denotes both a ship and a bird can easily guess that frégate is polysemous: it is difficult to conceive of something being a ship and a bird at the same time. We say that these two hypernyms are "incompatible". Given a list of all incompatible hypernym pairs (referred to as incompatibility rules later in this paper), one could easily detect polysemous words. Is it possible to create such a list? Can it be done automatically? To answer these questions, we experimented on the French lexical-semantic network JeuxDeMots, Lafourcade (2007), which is a free and open resource.

Identifying polysemous words is crucial for understanding a text. It is usually done by detecting high-density components in co-occurrence graphs built from large corpora, as in Véronis (2003). Similar methods have been used by Dorow and Widdows (2003) and Ferret (2004) to discover word senses, also from corpora. To detect the dense areas of their graphs, Dorow and Widdows (2003) used the Markov Cluster Algorithm, van Dongen (2000). These methods are very effective, but they depend heavily on the corpora used to create the graphs, which may induce many biases. To choose suitable glosses for naming the different word senses, Dorow and Widdows (2003) used the hypernyms present in the lexical network WordNet, Fellbaum (1998). WordNet is also used by Ferret (2004) to evaluate his results. We tested our approach on JeuxDeMots, and no other French resource equivalent to WordNet is complete enough to compare our results against automatically. Hence, we had to rely on manual evaluation.

In this paper, we first present the JeuxDeMots network and some of its specificities. We then detail the method we used (a) to generate a list of incompatible hypernyms and (b) to infer glosses for naming word senses, followed by some evaluations.
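The core idea of the abstract, flagging a word as polysemous when two of its hypernyms match an incompatibility rule, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the rule set `INCOMPATIBLE`, the toy data in `HYPERNYMS`, and the function names are hypothetical, and the sketch does not query JeuxDeMots. It only shows how such rules could be applied once a list of them is available.

```python
from itertools import combinations

# Hypothetical incompatibility rules: unordered pairs of hypernyms that are
# assumed never to describe the same sense of a word.
INCOMPATIBLE = {
    frozenset({"ship", "bird"}),
    frozenset({"animal", "vehicle"}),
}

# Toy hypernym data (assumption for illustration, not taken from JeuxDeMots).
HYPERNYMS = {
    "frégate": {"ship", "bird"},
    "sparrow": {"bird", "animal"},
}


def incompatible_pairs(hypernyms, rules=INCOMPATIBLE):
    """Return the hypernym pairs of a word that match an incompatibility rule."""
    return [
        (a, b)
        for a, b in combinations(sorted(hypernyms), 2)
        if frozenset({a, b}) in rules
    ]


def is_polysemous(word, network=HYPERNYMS, rules=INCOMPATIBLE):
    """Flag a word as polysemous if any two of its hypernyms are incompatible."""
    return bool(incompatible_pairs(network.get(word, set()), rules))


print(is_polysemous("frégate"))  # True: "ship" and "bird" are incompatible
print(is_polysemous("sparrow"))  # False: no rule covers "animal" and "bird"
```

Storing each rule as a `frozenset` makes the lookup independent of the order of the two hypernyms. The matched hypernyms themselves are natural candidates for naming the senses they separate, in the spirit of step (b).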

Domains

Document and Text Processing; Computation and Language [cs.CL]

Main file

identifying-polysemous-words-vfinal.pdf (136.74 KB)

Origin: Files produced by the author(s)

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01763423

Submitted on: Wednesday, April 11, 2018, 10:13:09

Last modified on: Tuesday, November 12, 2024, 15:20:07

Dates and versions

lirmm-01763423 , version 1 (11-04-2018)

Identifiers

HAL Id : lirmm-01763423 , version 1

Cite

Maxime Chapuis, Mathieu Lafourcade. Identifying Polysemous Words and Inferring Sense Glosses in a Semantic Network. IWCS 2017 - 12th International Conference on Computational Semantics, Sep 2017, Montpellier, France. ⟨lirmm-01763423⟩

Collections

UNIV-ST-ETIENNE ENS-LYON UNIV-LYON3 PRES_CLERMONT CNRS UNIV-LYON2 CERHAC TEXTE LIRMM MIPS UNIV-MONTPELLIER IHRIM UDL
