Handling scalable approximate queries over NoSQL graph databases: Cypherf and the Fuzzy4S framework
Abstract
NoSQL databases are often considered for Big Data solutions because they handle volume and velocity issues efficiently and can manage some kinds of complex data (e.g., documents, graphs). However, fuzzy approaches are often not efficient on such frameworks. This article therefore introduces a novel approach to defining and running approximate queries over NoSQL graph databases using Scala, by proposing the Fuzzy4S framework and the Cypherf fuzzy declarative query language. NoSQL graph databases are attracting growing interest and are used in many real-world applications. The Fuzzy4S framework is defined with an open DSL (Domain-Specific Language) that allows scalable approximate queries to be defined at an abstract level. Cypherf is an extension of Cypher, the query language of the Neo4j NoSQL graph database. This work is a complete approach covering the whole chain, from the end-user declarative query level down to implementation issues within the database engine. We provide both the formal definitions of approximate NoSQL graph queries and experimental results that demonstrate the relevance and efficiency of our proposal.