Healthcare Trajectory Mining by Combining Multi-dimensional Component and Itemsets
Abstract
Sequential pattern mining is an approach to extract correlations among temporal data. Many methods have been proposed to enumerate either sequences of set-valued data (i.e., itemsets) or sequences containing multi-dimensional items. However, in many real-world scenarios, data sequences are described as events carrying both multi-dimensional and set-valued information. Traditional approaches cannot exploit these rich heterogeneous descriptions. For example, in the healthcare domain, hospitalizations are defined as sequences of multi-dimensional attributes (e.g. Hospital or Diagnosis) associated with sets of medical procedures (e.g. {Radiography, Appendectomy}). In this paper we propose a new approach called MMISP (Mining Multi-dimensional-Itemset Sequential Patterns) to extract patterns from sequences that include both multi-dimensional and set-valued data. The novelties of the proposal lie in: (i) the way in which the data can be efficiently compressed; (ii) the ability to reuse a state-of-the-art sequential pattern mining algorithm; and (iii) the extraction of a new kind of pattern. As a case study, we present experiments on real data aggregated from a regional healthcare system and point out the usefulness of the extracted patterns. Additional experiments on synthetic data highlight the efficiency and scalability of our approach.
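To make the data model in the abstract concrete, the following minimal sketch represents each hospitalization as a pair of a multi-dimensional component and an itemset of procedures, and counts how many patient sequences contain a given pattern. It is an illustration only and not the MMISP algorithm itself: the names `Event`, `event_matches`, `contains`, `support`, the `*` wildcard, and all sample values are assumptions introduced here.

```python
from typing import FrozenSet, List, Tuple

# An event pairs a multi-dimensional component, e.g. (hospital, diagnosis),
# with a set of medical procedures. All values below are made up.
Event = Tuple[Tuple[str, ...], FrozenSet[str]]
Sequence = List[Event]

WILDCARD = "*"  # hypothetical marker: "any value" for a pattern dimension


def event_matches(pattern: Event, event: Event) -> bool:
    """True if every pattern dimension equals the event's (or is a wildcard)
    and the pattern's itemset is a subset of the event's itemset."""
    p_dims, p_items = pattern
    e_dims, e_items = event
    if len(p_dims) != len(e_dims):
        return False
    dims_ok = all(p == WILDCARD or p == e for p, e in zip(p_dims, e_dims))
    return dims_ok and p_items <= e_items


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """Order-preserving containment: pattern events are matched greedily
    against the earliest possible events of the sequence."""
    pos = 0
    for event in sequence:
        if pos < len(pattern) and event_matches(pattern[pos], event):
            pos += 1
    return pos == len(pattern)


def support(database: List[Sequence], pattern: Sequence) -> int:
    """Number of sequences in the database that contain the pattern."""
    return sum(contains(seq, pattern) for seq in database)


patients: List[Sequence] = [
    [(("H1", "Appendicitis"), frozenset({"Radiography", "Appendectomy"})),
     (("H1", "Checkup"), frozenset({"Blood test"}))],
    [(("H2", "Appendicitis"), frozenset({"Radiography"})),
     (("H2", "Checkup"), frozenset({"Blood test", "Radiography"}))],
    [(("H1", "Fracture"), frozenset({"Radiography"}))],
]

pattern: Sequence = [
    ((WILDCARD, "Appendicitis"), frozenset({"Radiography"})),
    ((WILDCARD, "Checkup"), frozenset({"Blood test"})),
]

print(support(patients, pattern))  # → 2
```

A pattern is considered frequent when its support reaches a user-chosen threshold; how MMISP compresses the data and enumerates candidate patterns efficiently is described in the paper and is not reproduced in this sketch.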
Origin: Files produced by the author(s)