
Time is Everything: A Comparative Study of Human Evaluation of SMT vs. NMT

Abstract : Translation process research has developed tools to gather and analyse empirical data. A variety of measures have proved useful and reliable for assessing machine translation post-editing (MTPE) effort (see e.g. Vieira 2016: 42), yet translation processes are seldom considered when assessing the relevance of a given MTPE scenario. Our study seeks to determine the impact of including MTPE in the evaluation process. We selected adequacy and fluency ratings. Under two distinct experimental conditions, we then compared the ratings produced without performing post-editing (PE) and those produced immediately after a light PE process. Inter-rater reliability was assessed for each segment in each text (N=55) using Fleiss' kappa for adequacy and fluency scores, and an intraclass correlation coefficient (Vieira 2016: 52) for temporal measures. While the reliability of the measures collected without PE was low, the measures collected in PET were for the most part homogeneous. Qualitative analyses of the problematic segments, as identified by both kappa and intraclass correlation coefficients, showed strong Spearman's correlations, whether positive or negative, between temporal measures and all the other metrics for NMT, but weaker ones for SMT. Based on these results, we discuss the advantages and risks of NMTPE.
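As an illustration of the agreement statistic named in the abstract, the following is a minimal sketch of Fleiss' kappa in plain Python. It is not the authors' code: the function name, the input layout (one row per segment, one column per rating category, each cell holding the number of raters who chose that category) and the toy ratings are all assumptions made for this example.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of raters who assigned item i (e.g. a
    segment) to category j (e.g. an adequacy score). Every item must be
    rated by the same number of raters (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")

    n_categories = len(counts[0])
    total_ratings = n_items * n_raters

    # Proportion of all ratings that fall into each category.
    p_cat = [sum(row[j] for row in counts) / total_ratings
             for j in range(n_categories)]

    # Observed agreement per item: share of rater pairs that agree.
    p_item = [(sum(c * c for c in row) - n_raters)
              / (n_raters * (n_raters - 1))
              for row in counts]

    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)

    if p_expected == 1.0:
        # All ratings fall in a single category: kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical data: 4 segments, 3 raters, fluency scored in 3 categories.
toy_counts = [
    [3, 0, 0],
    [0, 3, 0],
    [1, 2, 0],
    [0, 1, 2],
]
print(round(fleiss_kappa(toy_counts), 3))
```

Values near 1 indicate agreement well above chance, and values near 0 indicate agreement no better than chance. The abstract reports assessing reliability for each segment in each text; how the counts were grouped for that purpose is not stated here, so the table layout above is only one plausible arrangement.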
Document type :
Conference papers

Cited literature [9 references]
Contributor : Emmanuelle Esperança-Rodier
Submitted on : Thursday, November 14, 2019 - 12:02:57 PM
Last modification on : Friday, July 10, 2020 - 7:58:58 AM
Document(s) archived on : Saturday, February 15, 2020 - 2:25:55 PM


Files produced by the author(s)


  • HAL Id : hal-02363210, version 1



Emmanuelle Esperança-Rodier, Caroline Rossi. Time is Everything: A Comparative Study of Human Evaluation of SMT vs. NMT. Translating and the Computer 41, Nov 2019, London, United Kingdom. ⟨hal-02363210⟩


