Scientific Workflow Scheduling with Provenance Data in a Multisite Cloud

Ji Liu (1), Esther Pacitti (1, 2), Patrick Valduriez (1, 2), Marta Mattoso (3)
1 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: Recently, some Scientific Workflow Management Systems (SWfMSs) with provenance support (e.g., Chiron) have been deployed in the cloud. However, they typically use a single cloud site. In this paper, we consider a multisite cloud, where the data and computing resources are distributed across different sites (possibly in different regions). Based on a multisite SWfMS architecture, i.e., multisite Chiron, and its provenance model, we propose a multisite task scheduling algorithm that considers the time to generate provenance data. We performed an extensive experimental evaluation of our algorithm using the Microsoft Azure multisite cloud and two real-life scientific workflows (Buzz and Montage). The results show that our scheduling algorithm is up to 49.6% better than baseline algorithms in terms of total execution time.
Cited literature: 32 references

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01620224
Contributor: Patrick Valduriez
Submitted on: Friday, October 20, 2017 - 11:51:28 AM
Last modification on: Friday, March 15, 2019 - 1:15:10 AM
Long-term archiving on: Sunday, January 21, 2018 - 2:16:11 PM

File

TLDKS.pdf
Files produced by the author(s)

Citation

Ji Liu, Esther Pacitti, Patrick Valduriez, Marta Mattoso. Scientific Workflow Scheduling with Provenance Data in a Multisite Cloud. Transactions on Large-Scale Data- and Knowledge-Centered Systems, Springer Berlin / Heidelberg, 2017, 33, pp.80-112. ⟨10.1109/IPDPS.2007.370305⟩. ⟨lirmm-01620224⟩
