Development of a generic indexing tool to optimize the use of biological data
Développement d'un outil générique d'indexation pour optimiser l'exploitation de données biologiques
Abstract
This application was developed in the context of studies of genetic and phenotypic diversity in Asian and African rice (Oryza sativa and Oryza glaberrima). These studies use association genetics to identify genes of interest, in order to understand the biological processes involved in plant development, plasticity, and disease resistance, and to support their use in breeding programs. They require handling large volumes of heterogeneous data stored in different formats (Excel files, structured or semi-structured text, images, etc.). The volume and diversity of these data make it difficult for researchers to exploit them fully, and relational database design appears too limited and insufficiently scalable for this purpose. In this context, we have developed a generic integration and indexing tool for navigating, sharing, and annotating these data. The project is built on MongoDB, a document-oriented NoSQL database management system, which allows data to be organized and modeled dynamically. The innovative aspect of the project is a scalable system that lets users carry out every step, from data integration to query formulation.
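The idea of generic indexing over heterogeneous records can be illustrated with a minimal sketch. It is not the tool's implementation: plain Python dictionaries stand in for MongoDB documents, and the record identifiers, field names, values, and the helpers `flatten`, `build_index`, and `query` are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def flatten(doc, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf of a nested document."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from flatten(value, f"{prefix}{key}.")
    elif isinstance(doc, list):
        # List items share the path of the field that holds them.
        for item in doc:
            yield from flatten(item, prefix)
    else:
        yield prefix.rstrip("."), doc


def build_index(documents):
    """Map each (path, normalized value) pair to the ids of matching documents."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for path, value in flatten(doc):
            index[(path, str(value).lower())].add(doc_id)
    return index


def query(index, path, value):
    """Return the sorted ids of documents whose field `path` equals `value`."""
    return sorted(index.get((path, str(value).lower()), set()))


# Hypothetical records with different shapes; no shared schema is required.
documents = {
    "acc-1": {"species": "Oryza sativa", "origin": "Asia",
              "traits": {"panicle_length_cm": 24.5}},
    "acc-2": {"species": "Oryza glaberrima", "origin": "Africa",
              "images": ["leaf.png"]},
    "acc-3": {"species": "Oryza sativa", "notes": "blast resistant"},
}

index = build_index(documents)
print(query(index, "species", "oryza sativa"))
print(query(index, "traits.panicle_length_cm", 24.5))
```

Because the index is keyed on field paths discovered at load time, a record with a new field becomes searchable without any schema change, which is the property the document-oriented model is meant to provide here.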
Origin: Files produced by the author(s)