Poster communications

RegExpMiner: Automatically discovering frequently matching regular expressions

Abstract : Regular expressions (REs) are a widely used tool for working with textual data. They are, for instance, commonly used as search templates, or to check the validity of a piece of text w.r.t. some formatting requirements. Such applications typically rely on a very small number of hand-crafted REs. However, automatically constructing REs that match a large number of string examples (e.g., a set of attribute values in a database) would help characterize the formatting rules underlying these strings with no a priori knowledge, and would open the way to new RE-based applications. We propose to formulate the problem of discovering such REs as a frequent pattern mining problem.
Contributor : Mathieu Roche
Submitted on : Thursday, March 21, 2019 - 7:58:16 PM
Last modification on : Tuesday, September 7, 2021 - 3:55:02 PM
Long-term archiving on : Saturday, June 22, 2019 - 4:24:21 PM




  • HAL Id : lirmm-01054922, version 1


Julien Rabatel, Jérôme Azé, Pascal Poncelet, Mathieu Roche. RegExpMiner: Automatically discovering frequently matching regular expressions. ECML&PKDD, Sep 2014, Nancy, France. European Conference on Machine Learning (ECML) and Principles and Practice of Knowledge Discovery in Databases (PKDD), CEUR Workshop Proceedings (1202), pp.143-144, 2014. ⟨lirmm-01054922⟩


