Multichannel Audio Modeling with Elliptically Stable Tensor Decomposition

Mathieu Fontaine 1 Fabian-Robert Stöter 2, 3 Antoine Liutkus 2, 3 Umut Simsekli 4 Romain Serizel 1 Roland Badeau 4
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
2 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
3 LIRMM/HE - Hors Équipe
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: This paper introduces a new method for multichannel speech enhancement based on a versatile model of the residual noise spectrogram. Such a model was previously presented for the single-channel case, where the noise component is assumed to follow an alpha-stable distribution in each time-frequency bin, whereas the speech spectrogram, assumed to be more regular, is modeled as Gaussian. In this paper, we describe a multichannel extension of this model, as well as a Monte Carlo Expectation-Maximisation algorithm for parameter estimation. In particular, a multichannel extension of the Itakura-Saito nonnegative matrix factorization is exploited to estimate the spectral parameters for speech, and a Metropolis-Hastings algorithm is proposed to estimate the noise contribution. We evaluate the proposed method in a challenging multichannel denoising application and compare it to other state-of-the-art algorithms.
Document type:
Conference paper
LVA ICA 2018 - 14th International Conference on Latent Variable Analysis and Signal Separation, Jul 2018, Surrey, United Kingdom. 2018

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01766795
Contributor: Antoine Liutkus
Submitted on: Saturday, 14 April 2018 - 09:58:31
Last modified on: Friday, 20 April 2018 - 09:37:46

File

LVA-ICA2018_046_original_v5.pd...
Files produced by the author(s)

Identifiers

  • HAL Id: lirmm-01766795, version 1

Citation

Mathieu Fontaine, Fabian-Robert Stöter, Antoine Liutkus, Umut Simsekli, Romain Serizel, et al. Multichannel Audio Modeling with Elliptically Stable Tensor Decomposition. LVA ICA 2018 - 14th International Conference on Latent Variable Analysis and Signal Separation, Jul 2018, Surrey, United Kingdom. 2018. 〈lirmm-01766795〉

