An open source speech synthesis module for a visual-speech recognition system

Sotiris Manitsaris; Bruce Denby; Florent Xavier; Jun Cai; Maureen Stone; Pierre Roussel; Gérard Dreyfus

Conference paper, Year: 2012

Sotiris Manitsaris

Role: Corresponding author
PersonId : 19356
IdHAL : sotiris-manitsaris
ORCID : 0000-0003-4552-1793
IdRef : 221655220

Bruce Denby

Role: Author
PersonId : 905746

Laboratoire Signaux, Modèles et Apprentissage Statistique

Florent Xavier

Role: Author
PersonId : 939941

Jun Cai

Role: Author
PersonId : 905747

Laboratoire Signaux, Modèles et Apprentissage Statistique

Maureen Stone

Role: Author
PersonId : 905750

Vocal Tract Visualization Lab [Baltimore]

Pierre Roussel

Role: Author
PersonId : 905748

Laboratoire Signaux, Modèles et Apprentissage Statistique

Gérard Dreyfus

Role: Author
PersonId : 905749

Laboratoire Signaux, Modèles et Apprentissage Statistique

Abstract

A Silent Speech Interface (SSI) is a voice replacement technology that permits speech communication without vocalisation. The visual-speech recognition engine of the proposed SSI is based on vocal tract imaging. The system aims to give laryngectomised speakers the opportunity to speak with their original voice. This paper presents the speech synthesis module of an SSI, built on the open-source MaryTTS text-to-speech (TTS) platform. The visual-speech recognition engine of the SSI outputs a text sentence, which is passed to the speech synthesis module to synthesise speech in French or English. A new phonetic transcription module has been developed and integrated into MaryTTS. In addition, English and French semi-HMM (Hidden Markov Model) voices have been built. The SSI can be remotely controlled from a mobile device, and the new voices are installed on a web server.

Keywords

visual-speech recognition; speech synthesis; imaging; modelling

Subject areas

Acoustics [physics.class-ph]

Main file

hal-00811261.pdf (628.8 KB)

Origin: Publisher files allowed on an open archive

https://hal.science/hal-00811261

Submitted on: Monday, 23 April 2012, 10:00:00

Last modified on: Friday, 19 April 2024, 16:18:57

Long-term archiving on: Sunday, 18 December 2016, 14:20:44

Dates and versions

hal-00811261 , version 1 (23-04-2012)

Identifiers

HAL Id : hal-00811261 , version 1

Cite

Sotiris Manitsaris, Bruce Denby, Florent Xavier, Jun Cai, Maureen Stone, et al. An open source speech synthesis module for a visual-speech recognition system. Acoustics 2012, Apr 2012, Nantes, France. ⟨hal-00811261⟩

Collections

UPMC ESPCI CNRS PARISTECH SIGMA ACOUSTICS2012 PSL

372 views

525 downloads
