Tracking of Online Parameter Fine-tuning in Scientific Workflows

Renan Souza 1, Vitor Silva 1, José Camata 1, Alvaro Coutinho 1, Patrick Valduriez 2, Marta Mattoso 1
2 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract (extended abstract): In typical large-scale scientific applications, several parameters of complex computational models have to be predefined in a simulation, each with a wide range of possible values. Listing all possible combinations of parameters and exhaustively trying them all is nearly impossible even in extreme-scale High Performance Computing (HPC). There may be a huge number of possible combinations, and processing each one may take several hours or days, making the whole computation last for weeks or months. Typically, after the initial setup, the scientist starts the computation and occasionally fine-tunes specific parameters based on the analysis of intermediate results. The term "human-in-the-loop" is used when computational scientists can actively participate in the computational process. Specific adaptations can yield important improvements in performance, resource consumption, and quality of results [2]. To allow for online human adaptation, dynamic workflow solutions are required. Most existing workflow solutions do not allow for online human adaptations, which a recent survey considers a future research challenge [4]. Chiron [3], WorkWays [7], and Copernicus [8] are a few exceptions that allow for online data adaptation. Parameter tuning, the subject of this work, is only one of many types of adaptation possible in human-adapted workflows in HPC [6]. Registering the adaptations is essential to track and analyze the effects of fine-tunings. In [1], the authors discuss the past, present, and future of scientific workflows, and as a future issue they mention that "monitoring and logging will be enhanced with more interactive components for intermediate stages of active workflows." We did not find any work that registers workflow adaptations in logs or in provenance databases.
This work aims at capturing and registering such human adaptation data (e.g., values before and after a specific parameter fine-tuning; reason for the tuning), relating them to other relevant data (e.g., domain-specific strategic values and execution data), and allowing all these data to be efficiently integrated. This contributes to online data analysis and data-driven decisions (e.g., how a specific user action impacted the processing time), helping to put humans in the online loop of large-scale scientific computing. Recording those adaptations also contributes to the reliability and reproducibility of the results. We developed DfAdapter [5], a tool that collects human adaptations in the dataflow while the workflow runs. It controls and stores specific parameter tunings in a provenance database, relating the human adaptation data to domain, dataflow provenance, execution, and performance data. As shown in Figure 1, DfAdapter initially registers the user Bob, who is going to adapt the dataflow; then it registers identifiers of the current state of the workflow (e.g., step i of the loop). To track the tunings, it receives from Bob the set of parameters to change, e.g., attr5 in Dataset2 to be set to "val5". Then, DfAdapter modifies the values in Dataset2. Finally, it registers the iteration counter, the execution state at the moment of the adaptation, the dataset, the values before and after, and the current wall time, all in the provenance database. Relevant insight is obtained with visualizations complemented by tracking queries like "List all Bob's tunings correlating with time step" or "Avg. of values 10 iterations before and after the tunings". DfAdapter can be coupled to a workflow managed by a parallel workflow management system, to a workflow defined using an HPC library, or even to a script. DfAdapter works as a debugging tool on instrumented code.

Figure 1. Tuning parameters in a dataset in a dataflow.
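The registration steps described above can be illustrated with a minimal Python sketch. It is not DfAdapter's implementation or API: the `adaptation` table, its columns, the functions `open_provenance_db`, `tune`, and `tunings_by_user`, the prior value "val4", the iteration number, and the tuning reason are all hypothetical names and values chosen for illustration, and an in-memory SQLite database stands in for the provenance database.

```python
import sqlite3
import time


def open_provenance_db(path=":memory:"):
    """Create a toy provenance table (hypothetical schema, not DfAdapter's)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS adaptation (
               id INTEGER PRIMARY KEY,
               user_name TEXT, iteration INTEGER, exec_state TEXT,
               dataset TEXT, attribute TEXT,
               value_before TEXT, value_after TEXT,
               reason TEXT, wall_time REAL)"""
    )
    return conn


def tune(conn, user, iteration, exec_state, dataset_name, dataset, changes, reason=""):
    """Apply `changes` to `dataset` in place and log one row per tuned attribute."""
    for attribute, new_value in changes.items():
        old_value = dataset.get(attribute)  # value before the tuning
        dataset[attribute] = new_value      # the actual adaptation
        conn.execute(
            "INSERT INTO adaptation (user_name, iteration, exec_state, dataset,"
            " attribute, value_before, value_after, reason, wall_time)"
            " VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
            (user, iteration, exec_state, dataset_name, attribute,
             None if old_value is None else str(old_value),
             str(new_value), reason, time.time()),
        )
    conn.commit()


def tunings_by_user(conn, user):
    """Tracking query: list a user's tunings ordered by iteration (time step)."""
    cursor = conn.execute(
        "SELECT iteration, dataset, attribute, value_before, value_after"
        " FROM adaptation WHERE user_name = ? ORDER BY iteration, id",
        (user,),
    )
    return cursor.fetchall()


# Hypothetical run: Bob sets attr5 to "val5" in Dataset2 at iteration 12.
conn = open_provenance_db()
dataset2 = {"attr5": "val4", "attr6": "val6"}  # assumed prior contents
tune(conn, "Bob", 12, "RUNNING", "Dataset2", dataset2,
     {"attr5": "val5"}, reason="slow convergence")
print(tunings_by_user(conn, "Bob"))
```

Storing the value before and after each change in the same row as the iteration counter and wall time is what makes queries such as "List all Bob's tunings correlating with time step" a simple filter and sort in this sketch.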
Document type:
Conference paper
Workflows in Support of Large-Scale Science (WORKS), in conjunction with ACM/IEEE Supercomputing, Nov 2017, Denver, United States. 2017

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01620974
Contributor: Patrick Valduriez
Submitted on: Sunday, October 22, 2017 - 16:27:29
Last modified on: Thursday, May 24, 2018 - 15:59:21
Document(s) archived on: Tuesday, January 23, 2018 - 12:27:03

File

WORKS'17-1page-ack.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: lirmm-01620974, version 1

Citation

Renan Souza, Vitor Silva, José Camata, Alvaro Coutinho, Patrick Valduriez, et al. Tracking of Online Parameter Fine-tuning in Scientific Workflows. Workflows in Support of Large-Scale Science (WORKS), in conjunction with ACM/IEEE Supercomputing, Nov 2017, Denver, United States. 2017. 〈lirmm-01620974〉
