RegExpMiner: Automatically discovering frequently matching regular expressions
Abstract
Regular expressions (REs) are a widely used tool for working with textual data. They are, for instance, commonly used as search templates, or to check the validity of a piece of text w.r.t. some formatting requirements. Such applications typically rely on a very small number of hand-crafted REs. However, automatically constructing REs that match a large number of example strings (e.g., a set of attribute values in a database) would help characterize the formatting rules underlying these strings with no a priori knowledge, and would enable new RE-based applications. We propose to formulate the problem of discovering such REs as a frequent pattern mining problem.
Origin: Files produced by the author(s)