The ClueWeb09 dataset, 2009.
Database-friendly random projections: Johnson-Lindenstrauss with binary coins, Journal of Computer and System Sciences, vol.66, issue.4, pp.671-687, 2003. ,
Efficient similarity search in sequence databases, Proceedings of the International Conference on Foundations of Data Organization and Algorithms (FODO), pp.69-84, 1993. ,
Fast algorithms for mining association rules in large databases, Proceedings of the International Conference on Very Large Data Bases (VLDB), pp.487-499, 1994. ,
Mining of massive datasets, 2012. ,
The ts-tree: Efficient time series search and retrieval, Proceedings of the International Conference on Extending Database Technology (EDBT), pp.252-263, 2008. ,
Computing n-gram statistics in MapReduce, Proceedings of the International Conference on Extending Database Technology (EDBT), p.71, 2013. ,
Survey of text mining II: clustering, classification, and retrieval, 2008. ,
The meaningful use of big data: four perspectives - four challenges, SIGMOD Rec, vol.40, issue.4, pp.56-60, 2011. ,
Beyond market baskets: Generalizing association rules to correlations, SIGMOD Rec, vol.26, issue.2, pp.265-276, 1997. ,
Indexing spatio-temporal trajectories with Chebyshev polynomials, Proceedings of the International Conference on Management of Data (SIGMOD), pp.599-610, 2004. ,
iSAX 2.0: Indexing and mining one billion time series, Proceedings of the International Conference on Data Mining (ICDM), pp.58-67, 2010. ,
Beyond one billion time series: indexing and mining very large time series collections with iSAX2+, Knowledge and Information Systems (KAIS), vol.39, issue.1, pp.123-151, 2014. ,
Locally adaptive dimensionality reduction for indexing large time series databases, ACM Transactions on Database Systems (TODS), vol.27, issue.2, pp.188-228, 2002. ,
Efficient time series matching by wavelets, Proceedings of the International Conference on Data Engineering (ICDE), pp.126-133, 1999. ,
A survey on feature selection methods, Computers and Electrical Engineering, vol.40, issue.1, pp.16-28, 2014. ,
Similarity estimation techniques from rounding algorithms, Proceedings of the Thirty-fourth Annual ACM Symposium on Theory of Computing (STOC), pp.380-388, 2002. ,
Fast window correlations over uncooperative time series, Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD), pp.743-749, 2005. ,
Cloud adoption practices and priorities survey report, 2015. ,
Elements of information theory, 2006. ,
MapReduce: simplified data processing on large clusters, Commun. ACM, vol.51, issue.1, pp.107-113, 2008. ,
Differential privacy, International Colloquium on Automata, Languages and Programming (ICALP), pp.1-12, 2006. ,
Time-series data mining, ACM Comput. Surv, vol.45, issue.1, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-01577883
Fast subsequence matching in time-series databases, SIGMOD Rec, vol.23, issue.2, pp.419-429, 1994. ,
Fast subsequence matching in time-series databases, Proceedings of the International Conference on Management of Data (SIGMOD), pp.419-429, 1994. ,
Tiling databases, International Conference on Discovery Science, pp.278-289, 2004. ,
Unsupervised learning, Advanced Lectures on Machine Learning, pp.72-112, 2004. ,
Mining frequent patterns in data streams at multiple time granularities, 2002. ,
Geometric and combinatorial tiles in 0-1 data, Knowledge Discovery in Databases (PKDD), pp.173-184, 2004. ,
Similarity search in high dimensions via hashing, Proceedings of the International Conference on Very Large Data Bases (VLDB), pp.518-529, 1999. ,
Entropy and information theory, 2011. ,
Information retrieval: A survey, 2000. ,
An introduction to variable and feature selection, J. Mach. Learn. Res, vol.3, pp.1157-1182, 2003. ,
Mining frequent patterns without candidate generation, SIGMOD Rec, vol.29, 2000. ,
Data mining: concepts and techniques, 2012. ,
Finding low-entropy sets and trees from binary data, Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD), pp.350-359, 2007. ,
Semigeometric tiling of event sequences, Machine Learning and Knowledge Discovery in Databases. ECML PKDD, pp.329-344, 2016. ,
An overview on subgroup discovery: foundations and applications, Knowledge and Information Systems, vol.29, issue.3, pp.495-525, 2011. ,
Stable distributions, pseudorandom generators, embeddings and data stream computation, 41st Annual Symposium on Foundations of Computer Science (FOCS), pp.189-197, 2000. ,
Searching in one billion vectors: re-rank with source coding, ICASSP, 2011. ,
Extensions of Lipschitz mappings into a Hilbert space, Conference in Modern Analysis and Probability, vol.26, pp.189-206, 1984. ,
Exact indexing of dynamic time warping, Proceedings of the International Conference on Very Large Data Bases (VLDB), 2002. ,
Dimensionality reduction for fast similarity search in large time series databases, Knowledge and Information Systems (KAIS), vol.3, issue.3, pp.263-286, 2001. ,
Maximally informative k-itemsets and their efficient discovery, Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD), pp.237-244, 2006. ,
Supervised machine learning: A review of classification techniques, Proceedings of International Conference on Emerging Artificial Intelligence Applications in Computer Engineering, pp.3-24, 2007. ,
Efficient search for approximate nearest neighbor in high dimensional spaces, Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing (STOC), pp.614-623, 1998. ,
PFP: parallel FP-growth for query recommendation, Proceedings of the ACM Conf. on Recommender Systems (RecSys), pp.107-114, 2008. ,
A symbolic representation of time series, with implications for streaming algorithms, Proceedings of the International Conference on Management of Data (SIGMOD), 2003. ,
Experiencing SAX: A novel symbolic representation of time series, Data Min. Knowl. Discov, vol.15, issue.2, pp.107-144, 2007. ,
Regime shifts in streams: Real-time forecasting of co-evolving time sequences, Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD), pp.1045-1054, 2016. ,
Mind the gap: Large-scale frequent sequence mining, Proceedings of International Conference on Management of Data (SIGMOD), pp.797-808, 2013. ,
Frequent itemset mining for big data, IEEE International Conference on Big Data, pp.111-118, 2013. ,
Fast approximate correlation for massive time-series data, Proceedings of the International Conference on Management of Data (SIGMOD), pp.171-182, 2010. ,
Data series management: The road to big sequence analytics, SIGMOD Record, vol.44, issue.2, pp.47-52, 2015. ,
Big sequence management: A glimpse of the past, the present, and the future, SOFSEM, 2016. ,
Streaming pattern discovery in multiple time-series, Proceedings of the International Conference on Very Large Data Bases (VLDB), pp.697-708, 2005. ,
Optimal multi-scale patterns in time series streams, Proceedings of the International Conference on Management of Data (SIGMOD), pp.647-658, 2006. ,
Fast relevance discovery in time series, Proceedings of the International Conference on Data Mining (ICDM), pp.1016-1020, 2006. ,
Searching and mining trillions of time series subsequences under dynamic time warping, Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2012. ,
PARMA: a parallel randomized algorithm for approximate association rules mining in MapReduce, Proceedings of the International Conference on Information and Knowledge Management (CIKM), pp.85-94, 2012. ,
Stream monitoring under the time warping distance, Proceedings of the International Conference on Data Engineering (ICDE), pp.1046-1055, 2007. ,
An efficient algorithm for mining association rules in large databases, Proceedings of the International Conference on Very Large Data Bases (VLDB), pp.432-444, 1995. ,
High performance discovery in time series: techniques and case studies, 2004. ,
iSAX: Disk-aware mining and indexing of massive time series datasets, Data Min. Knowl. Discov, vol.19, issue.1, pp.24-57, 2009. ,
iSAX: Indexing and mining terabyte sized time series, Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD), pp.623-631, 2008. ,
Parallel and distributed frequent pattern mining in large databases, Proceedings of the IEEE International Conference on High Performance Computing and Communications (HPCC), pp.407-414, 2009. ,
Probably the best itemsets, Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD), pp.293-302, 2010. ,
A regression-based temporal pattern mining scheme for data streams, Proceedings of the International Conference on Very Large Data Bases (VLDB), pp.93-104, 2003. ,
A data-adaptive and dynamic segmentation index for whole matching on time series, vol.6, pp.793-804, 2013. ,
Hadoop: the definitive guide, 2012. ,
Local correlation detection with linearity enhancement in streaming data, Proceedings of the International Conference on Information and Knowledge Management (CIKM), pp.309-318, 2013. ,
Matrix profile I: all pairs similarity joins for time series: A unifying view that includes motifs, discords and shapelets, Proceedings of the International Conference on Data Mining (ICDM), pp.1317-1322, 2016. ,
Spark: Cluster computing with working sets, Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, pp.10-10, 2010. ,
Discovering highly informative feature sets from data streams, Database and Expert Systems Applications, pp.91-104, 2010. ,
Indexing for interactive exploration of big data series, Proceedings of the International Conference on Management of Data (SIGMOD), pp.1555-1566, 2014. ,
ADS: the adaptive data series index, vol.25, pp.843-866, 2016. ,
Massively distributed time series indexing and querying, IEEE Transactions on Knowledge and Data Engineering (TKDE), 2019. ,
Spark-parSketch: A massively distributed indexing of time series datasets, Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM), pp.1951-1954, 2018. ,
URL : https://hal.archives-ouvertes.fr/lirmm-01886760
ParCorr: efficient parallel methods to identify similar time series pairs across sliding windows, Data Mining and Knowledge Discovery (DMKD), vol.32, issue.5, pp.1481-1507, 2018. ,
DPiSAX: Massively distributed partitioned iSAX, Proceedings of the International Conference on Data Mining (ICDM), pp.1135-1140, 2017. ,
A highly scalable parallel algorithm for maximally informative k-itemset mining, Knowledge and Information Systems (KAIS), vol.50, issue.1, pp.1-26, 2017. ,
URL : https://hal.archives-ouvertes.fr/lirmm-01288571
Fast parallel mining of maximally informative k-itemsets in big data, Proceedings of the International Conference on Data Mining (ICDM), pp.359-368, 2015. ,
URL : https://hal.archives-ouvertes.fr/lirmm-01187275
Data placement in massively distributed environments for fast parallel mining of frequent itemsets, Knowledge and Information Systems (KAIS), vol.53, pp.207-237, 2017. ,
URL : https://hal.archives-ouvertes.fr/lirmm-01620383
A differentially private index for range query processing in clouds, Proceedings of IEEE International Conference on Data Engineering (ICDE), pp.208-216, 2018. ,
URL : https://hal.archives-ouvertes.fr/lirmm-01886725
Privacy-preserving top-k query processing in distributed systems, Proceedings of the International European Conference on Parallel and Distributed Computing, pp.281-292, 2018. ,
URL : https://hal.archives-ouvertes.fr/lirmm-01886160
Answering top-k queries over outsourced sensitive data in the cloud, Proceedings of the International Conference on Database and Expert Systems Applications (DEXA), pp.218-231, 2018. ,
URL : https://hal.archives-ouvertes.fr/lirmm-01886164
FP-Hadoop: Efficient processing of skewed MapReduce jobs, Information Systems, vol.60, pp.69-84, 2016. ,
URL : https://hal.archives-ouvertes.fr/lirmm-01377715
FP-Hadoop: Efficient execution of parallel jobs over skewed data, Proceedings of the VLDB Endowment (PVLDB), vol.8, pp.1856-1859, 2015. ,
URL : https://hal.archives-ouvertes.fr/lirmm-01162362
Dynamic workload-based partitioning algorithms for continuously growing databases, Trans. Large-Scale Data- and Knowledge-Centered Systems, vol.12, pp.105-128, 2013. ,
URL : https://hal.archives-ouvertes.fr/lirmm-00906966
Profile diversity for query processing using user recommendations, Information Systems, vol.48, pp.44-63, 2015. ,
URL : https://hal.archives-ouvertes.fr/lirmm-01079523
Best position algorithms for efficient top-k query processing, Information Systems, vol.36, issue.6, pp.973-989, 2011. ,
URL : https://hal.archives-ouvertes.fr/lirmm-00607882
ASAP top-k query processing in unstructured P2P systems, Proceedings of the IEEE International Conference on Peer-to-Peer Computing (P2P), pp.1-10, 2010. ,
P2P logging and timestamping for reconciliation, Proceedings of the VLDB Endowment (PVLDB), vol.1, pp.1420-1423, 2008. ,
Replication in DHTs using dynamic groups, Trans. Large-Scale Data- and Knowledge-Centered Systems, vol.3, pp.1-19, 2011. ,
URL : https://hal.archives-ouvertes.fr/lirmm-00607915
Entity resolution for probabilistic data, Information Sciences, vol.277, pp.492-511, 2014. ,
URL : https://hal.archives-ouvertes.fr/lirmm-00879631
Entity resolution for distributed probabilistic data, Distributed and Parallel Databases (DAPD), vol.31, pp.509-542, 2013. ,
Fast and exact mining of probabilistic data streams, Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (PKDD), pp.493-508, 2013. ,
URL : https://hal.archives-ouvertes.fr/lirmm-00838618
Efficient evaluation of SUM queries over probabilistic data, IEEE Transactions on Knowledge and Data Engineering (TKDE), vol.25, issue.4, pp.764-775, 2013. ,
URL : https://hal.archives-ouvertes.fr/lirmm-00652293
Best position algorithms for top-k queries, Proceedings of the International Conference on Very Large Data Bases (VLDB), pp.495-506, 2007. ,
URL : https://hal.archives-ouvertes.fr/inria-00378836
Processing top-k queries in distributed hash tables, Proceedings of the International European Conference on Parallel and Distributed Computing, pp.489-502, 2007. ,
URL : https://hal.archives-ouvertes.fr/inria-00378864
Reducing network traffic in unstructured P2P systems using top-k queries, Distributed and Parallel Databases (DAPD), vol.19, pp.67-86, 2006. ,
URL : https://hal.archives-ouvertes.fr/hal-00416447
As-soon-as-possible top-k query processing in P2P systems, Trans. Large-Scale Data- and Knowledge-Centered Systems, vol.9, pp.1-27, 2013. ,
DHTJoin: processing continuous join queries using DHT networks, Distributed and Parallel Databases (DAPD), vol.26, pp.291-317, 2009. ,
URL : https://hal.archives-ouvertes.fr/inria-00410473
Distributed processing of continuous join queries using DHT networks, Proceedings of the EDBT/ICDT Workshops, pp.34-41, 2009. ,
Efficient processing of continuous join queries using distributed hash tables, Proceedings of the International European Conference on Parallel and Distributed Computing, pp.632-641, 2008. ,
URL : https://hal.archives-ouvertes.fr/inria-00368874
Data currency in replicated DHTs, Proceedings of International Conference on Management of Data (SIGMOD), pp.211-222, 2007. ,
URL : https://hal.archives-ouvertes.fr/inria-00378860