RegExpMiner: Automatically discovering frequently matching regular expressions

Abstract: Regular expressions (REs) are a widely used tool for working with textual data. They are, for instance, commonly used as search templates, or to check the validity of a piece of text with respect to some formatting requirements. Such applications typically rely on a very small number of hand-crafted REs. However, automatically constructing REs that match a large number of string examples (e.g., a set of attribute values in a database) would help characterize the formatting rules underlying these strings with no a priori knowledge, and would open the way to new RE-based applications. We propose to formulate the problem of discovering such REs as a frequent pattern mining problem.
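
As a rough illustration of the general idea only, and not of the RegExpMiner algorithm itself (which the abstract does not describe), the following Python sketch generalizes each string into a coarse regular expression and keeps the expressions whose support reaches a threshold. The names `generalize`, `frequent_patterns`, and `min_support`, as well as the sample values, are hypothetical and chosen for this example.

```python
import re
from collections import Counter


def generalize(value):
    """Turn a string into a coarse RE: runs of ASCII digits, ASCII letters,
    and whitespace become character classes; other characters stay literal."""
    parts = []
    for match in re.finditer(r"[0-9]+|[A-Za-z]+|\s+|.", value, flags=re.DOTALL):
        token = match.group()
        first = token[0]
        if first in "0123456789":
            parts.append("[0-9]+")
        elif first.isascii() and first.isalpha():
            parts.append("[A-Za-z]+")
        elif first.isspace():
            parts.append(r"\s+")
        else:
            # Assumption of this sketch: punctuation is kept as a literal.
            parts.append(re.escape(token))
    return "".join(parts)


def frequent_patterns(values, min_support=0.5):
    """Return {pattern: support} for generalized patterns whose relative
    frequency among `values` is at least `min_support`."""
    counts = Counter(generalize(v) for v in values)
    total = len(values)
    return {p: c / total for p, c in counts.items() if c / total >= min_support}


if __name__ == "__main__":
    # Hypothetical attribute values, e.g. a date column with one outlier.
    column = ["2014-09-15", "2018-01-11", "2014-08-09", "n/a"]
    for pattern, support in frequent_patterns(column).items():
        print(pattern, support)
```

In this sketch, each string maps to exactly one candidate expression, so counting is trivial; an actual frequent pattern mining approach would explore a much larger space of candidate REs per string.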
Document type:
Poster
ECML PKDD: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Sep 2014, Nancy, France. CEUR, 1st International Workshop on Interactions between Data Mining and Natural Language Processing co-located with ECML PKDD'2014: The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 1202, pp.143-144, 2014, 〈http://ecmlpkdd2014.loria.fr〉

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01054922
Contributor: Mathieu Roche
Submitted on: Saturday, August 9, 2014 - 23:03:52
Last modified on: Thursday, January 11, 2018 - 06:27:21

Identifiers

  • HAL Id : lirmm-01054922, version 1

Citation

Julien Rabatel, Jérôme Azé, Pascal Poncelet, Mathieu Roche. RegExpMiner: Automatically discovering frequently matching regular expressions. ECML PKDD: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Sep 2014, Nancy, France. CEUR, 1st International Workshop on Interactions between Data Mining and Natural Language Processing co-located with ECML PKDD'2014: The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 1202, pp.143-144, 2014, 〈http://ecmlpkdd2014.loria.fr〉. 〈lirmm-01054922〉
