Conference Papers Year : 2022

Data and Machine Learning Model Management with Gypscie


Abstract

As predictive analytics using ML models (or models for short) become prevalent in different stages of scientific exploration, a new set of artifacts are produced during the models’ life-cycle that need to be managed [2]. In addition to the models with their evolving versions, ML life-cycle artifacts include the collected training data and pre-processing workflows, data labels and selected features, model training, tuning and monitoring statistics and provenance information. However, to realize the full potential of data science, these artifacts must be built and combined, which can be very complex as there can be many to select from. Furthermore, they should be shared and reused, in particular, in different execution environments such as HPC or Spark clusters. In order to support the complete ML life-cycle process and produced artifacts, we have been developing the Gypscie framework, which offers collaborating researchers a common software infrastructure to develop, share, improve and publish ML artifacts.
Main file
CARLA_Gypscie_SAVIME.pdf (273.65 KB)
Origin : Files produced by the author(s)

Dates and versions

lirmm-03799097 , version 1 (05-10-2022)

Identifiers

  • HAL Id : lirmm-03799097 , version 1

Cite

Patrick Valduriez, Fabio Porto. Data and Machine Learning Model Management with Gypscie. CARLA 2022 - Workshop on HPC and Data Sciences meet Scientific Computing, SCALAC, Sep 2022, Porto Alegre, Brazil. pp.1-2. ⟨lirmm-03799097⟩