RegExpMiner: Automatically discovering frequently matching regular expressions

Abstract : Regular expressions (REs) are a widely used tool for working with textual data. They are, for instance, commonly used as search templates or to check the validity of a piece of text with respect to some formatting requirements. Such applications typically rely on a very small number of hand-crafted REs. However, automatically constructing REs that match a large number of string examples (e.g., a set of attribute values in a database) would help characterize the formatting rules underlying these strings with no a priori knowledge, and would enable new RE-based applications. We propose to formulate the problem of discovering such REs as a frequent pattern mining problem.
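
As a rough illustration of what treating RE discovery as frequent pattern mining could look like, the sketch below generalizes each string into a coarse RE and keeps the REs that reach a minimum support. It is not the authors' algorithm: the function names `generalize` and `frequent_res`, the `min_support` parameter, and the choice of character classes are all assumptions made for this example.

```python
import re
import string
from collections import Counter


def generalize(value):
    """Map a string to a coarse RE (hypothetical generalization rule).

    Runs of ASCII digits become [0-9]+, runs of ASCII letters become
    [A-Za-z]+, and every other character is kept as an escaped literal.
    """
    parts = []
    for run in re.finditer(r"[0-9]+|[A-Za-z]+|.", value, flags=re.DOTALL):
        text = run.group()
        if text[0] in string.digits:
            parts.append("[0-9]+")
        elif text[0] in string.ascii_letters:
            parts.append("[A-Za-z]+")
        else:
            parts.append(re.escape(text))
    return "".join(parts)


def frequent_res(values, min_support):
    """Return the generalized REs shared by at least min_support of the values.

    min_support is a fraction between 0 and 1 (an assumption of this sketch);
    the result maps each frequent RE to the number of values that produced it.
    """
    counts = Counter(generalize(v) for v in values)
    threshold = min_support * len(values)
    return {pattern: n for pattern, n in counts.items() if n >= threshold}
```

Calling `frequent_res` on a column of attribute values with, say, `min_support=0.5` returns only the formats followed by at least half of the values, which is one simple way to surface a dominant formatting rule without prior knowledge of it.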

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01054922
Contributor : Mathieu Roche
Submitted on : Thursday, March 21, 2019 - 7:58:16 PM
Last modification on : Wednesday, September 18, 2019 - 4:04:04 PM
Long-term archiving on : Saturday, June 22, 2019 - 4:24:21 PM

File

FinalPosterDMNLP2014.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : lirmm-01054922, version 1

Citation

Julien Rabatel, Jérôme Azé, Pascal Poncelet, Mathieu Roche. RegExpMiner: Automatically discovering frequently matching regular expressions. ECML&PKDD, Sep 2014, Nancy, France. European Conference on Machine Learning (ECML) and Principles and Practice of Knowledge Discovery in Databases (PKDD), CEUR Workshop Proceedings (1202), pp.143-144, 2014. ⟨lirmm-01054922⟩
