Data and Machine Learning Model Management with Gypscie - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Conference Papers, Year: 2022

Data and Machine Learning Model Management with Gypscie

Abstract

As predictive analytics using ML models (or models for short) becomes prevalent in different stages of scientific exploration, a new set of artifacts is produced during the models’ life-cycle that needs to be managed [2]. In addition to the models with their evolving versions, ML life-cycle artifacts include the collected training data and pre-processing workflows, data labels and selected features, model training, tuning and monitoring statistics, and provenance information. However, to realize the full potential of data science, these artifacts must be built and combined, which can be very complex as there can be many to select from. Furthermore, they should be shared and reused, in particular in different execution environments such as HPC or Spark clusters. To support the complete ML life-cycle process and the artifacts it produces, we have been developing the Gypscie framework, which offers collaborating researchers a common software infrastructure to develop, share, improve and publish ML artifacts.
Main file
CARLA_Gypscie_SAVIME.pdf (273.65 KB)
Origin: Files produced by the author(s)

Dates and versions

lirmm-03799097, version 1 (05-10-2022)

Identifiers

  • HAL Id: lirmm-03799097, version 1

Cite

Fábio Porto, Patrick Valduriez. Data and Machine Learning Model Management with Gypscie. CARLA 2022 - Workshop on HPC and Data Sciences meet Scientific Computing, SCALAC, Sep 2022, Porto Alegre, Brazil. pp.1-2. ⟨lirmm-03799097⟩