Pre-processing and Indexing techniques for Constellation Queries in Big Data

Amir Khatibi; Fabio Porto; Joao Guilherme Rittmeyer; Eduardo Ogasawara; Patrick Valduriez; Dennis Shasha

doi:10.1007/978-3-319-64283-3_12

Conference paper. Year: 2017

Amir Khatibi

Role: Author

Laboratorio Nacional de Computação Cientifica [Rio de Janeiro]

Fabio Porto

Role: Author
PersonId : 932292
ORCID : 0000-0002-4597-4832

Laboratorio Nacional de Computação Cientifica [Rio de Janeiro]

Joao Guilherme Rittmeyer

Role: Author

Laboratorio Nacional de Computação Cientifica [Rio de Janeiro]

Eduardo Ogasawara

Role: Author
PersonId : 913660
ORCID : 0000-0002-0466-0626

Centro Federal de Educação Tecnológica Celso Suckow da Fonseca [Rio de Janeiro]

Patrick Valduriez

Role: Author
PersonId : 172604
IdHAL : patrick-valduriez
ORCID : 0000-0001-6506-7538
IdRef : 028314417

Scientific Data Management

Institut de Biologie Computationnelle

Dennis Shasha

Role: Author

Courant Institute of Mathematical Sciences [New York]

Abstract

Geometric patterns are defined by the spatial distribution of a set of objects. They occur in many spatial datasets, such as seismic, astronomical, and transportation data. A particularly interesting geometric pattern is exhibited by the Einstein cross, an astronomical phenomenon in which a single quasar is observed as four distinct sky objects when captured by telescopes on Earth. Finding such crosses, as well as other geometric patterns, collectively referred to as constellation queries, is a challenging problem because the potential number of sets of elements that compose shapes is exponentially large in the size of the dataset and the query pattern. In this paper we propose algorithms to optimize the computation of constellation queries. Our techniques involve pre-processing the query to reduce its dimensionality, as well as indexing the data with a PH-tree to speed up the computation of neighboring stars. We have implemented our techniques in Spark and evaluated them in a series of experiments. The PH-tree indexing showed very good results and guarantees query answer completeness.
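
To make the notion of a constellation query concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes that a candidate set matches the query when its pairwise distances equal those of the query up to one uniform scale factor and a relative tolerance. This matching rule, the names `constellation_query`, `matches`, and `pairwise`, and the `max_radius` and `tol` parameters are all illustrative assumptions. The linear scan that collects neighbors stands in for the indexed neighbor lookup that the paper performs with a PH-tree.

```python
from itertools import permutations
from math import dist


def pairwise(points):
    """Distances between all index pairs (i < j), in a fixed order."""
    n = len(points)
    return [dist(points[i], points[j]) for i in range(n) for j in range(i + 1, n)]


def matches(query, candidate, tol):
    """True if candidate[i] corresponds to query[i] up to one scale factor.

    Assumption: shapes are compared by pairwise distances only, so mirror
    images of the query also match. Points are assumed to be distinct.
    """
    dq, dc = pairwise(query), pairwise(candidate)
    scale = dc[0] / dq[0]  # scale factor fixed by the first pair
    return all(abs(c - scale * q) <= tol * scale * q for q, c in zip(dq, dc))


def constellation_query(query, data, max_radius, tol=0.05):
    """Return the sets of data points whose shape matches the query.

    Hypothetical brute-force version: for each anchor point, only points
    within max_radius are considered. The list comprehension below is a
    linear scan; a spatial index would replace it in practice.
    """
    found = set()
    for anchor in data:
        neighbors = [p for p in data if p != anchor and dist(anchor, p) <= max_radius]
        for rest in permutations(neighbors, len(query) - 1):
            candidate = (anchor, *rest)
            if matches(query, candidate, tol):
                # frozenset removes duplicates caused by query symmetries
                found.add(frozenset(candidate))
    return found


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    sky = [(10, 10), (12, 10), (12, 12), (10, 12), (50, 50), (11, 30)]
    for match in constellation_query(square, sky, max_radius=3.0):
        print(sorted(match))
```

Even with the radius filter, the inner loop enumerates every ordered selection of neighbors, which illustrates why the number of candidate sets grows so quickly with the dataset and the query size, and why reducing the query and indexing the neighbor search matter.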

Keywords

PH-tree Indexing; Query Pre-Processing; SQL extension; Dataset Pre-Processing; Constellation Queries; Geometric Shapes

Domains

Databases [cs.DB]

Main file

ConstellationQuery_Dexa.pdf (297.71 KB)

Origin: Files produced by the author(s)

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01620398

Submitted on: Friday, October 20, 2017, 15:04:19

Last modified on: Thursday, February 1, 2024, 10:04:51

Long-term archiving on: Sunday, January 21, 2018, 13:23:30

Dates and versions

lirmm-01620398 , version 1 (20-10-2017)

Identifiers

HAL Id : lirmm-01620398 , version 1
DOI : 10.1007/978-3-319-64283-3_12

Cite

Amir Khatibi, Fabio Porto, Joao Guilherme Rittmeyer, Eduardo Ogasawara, Patrick Valduriez, et al. Pre-processing and Indexing techniques for Constellation Queries in Big Data. DaWaK: Data Warehousing and Knowledge Discovery, Aug 2017, Lyon, France. pp.164-172, ⟨10.1007/978-3-319-64283-3_12⟩. ⟨lirmm-01620398⟩

Collections

UNIV-RENNES1 CNRS INRIA INRA IRISA ZENITH LIRMM INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC MIPS UNIV-MONTPELLIER UNIV-RENNES INRAE UR1-MATH-NUM

353 views

588 downloads