Time is Everything: A Comparative Study of Human Evaluation of SMT vs. NMT

Abstract : Translation process research has developed tools to gather and analyse empirical data. A variety of measures have proved useful and reliable for assessing machine translation post-editing (MTPE) effort (see e.g. Vieira 2016: 42), yet translation processes are seldom considered when assessing the relevance of a given MTPE scenario. Our study seeks to determine the impact of including MTPE in the evaluation process. We selected adequacy and fluency ratings as evaluation measures and, based on two distinct experimental conditions, compared the ratings produced without performing post-editing (PE) with those produced immediately after a light PE process. Inter-rater reliability was assessed for each segment in each text (N=55) using Fleiss' kappa for adequacy and fluency scores, and an intraclass correlation coefficient (Vieira 2016: 52) for temporal measures. While the reliability of the measures collected without PE was low, the measures collected in PET were for the most part homogeneous. Qualitative analyses of the problematic segments, as identified by both kappa and intraclass correlation coefficients, showed strong Spearman's correlations, whether positive or negative, between temporal measures and all the other metrics for NMT, but weaker ones for SMT. Based on these results, we discuss the advantages and risks of NMTPE.
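As an illustration of the inter-rater reliability measure named in the abstract, the sketch below computes Fleiss' kappa from categorical ratings. It is not the authors' implementation: the function name `fleiss_kappa`, the 1-4 rating scale, and the toy ratings are all assumptions made for the example, and none of the values come from the study.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings    -- list of lists; ratings[i] holds one category label per rater
    categories -- all admissible category labels (hypothetical 1-4 scale below)
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    # counts[i][j] = number of raters who assigned category j to item i
    counts = []
    for item in ratings:
        tally = Counter(item)
        counts.append([tally[c] for c in categories])

    # Observed agreement: mean over items of the per-item agreement P_i
    p_items = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items

    # Chance agreement: sum of squared overall category proportions
    total = n_items * n_raters
    proportions = [
        sum(row[j] for row in counts) / total for j in range(len(categories))
    ]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        # All ratings fall in a single category: kappa is undefined
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Invented adequacy ratings: 4 segments, 3 raters, scale 1-4 (assumption)
    toy_adequacy = [[4, 4, 3], [2, 2, 2], [3, 4, 4], [1, 2, 1]]
    print(fleiss_kappa(toy_adequacy, categories=[1, 2, 3, 4]))
```

A value of 1 indicates perfect agreement and values at or below 0 indicate agreement no better than chance. Temporal measures are continuous, which is why the abstract reports an intraclass correlation coefficient for them instead.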
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-02363210
Contributor : Emmanuelle Esperança-Rodier
Submitted on : Thursday, November 14, 2019 - 12:02:57 PM
Last modification on : Wednesday, November 20, 2019 - 1:06:28 AM

File

tc41_Final_Time_9.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02363210, version 1

Collections

LIG | CNRS | UGA

Citation

Emmanuelle Esperança-Rodier, Caroline Rossi. Time is Everything: A Comparative Study of Human Evaluation of SMT vs. NMT. Translating and the Computer 41, Nov 2019, London, United Kingdom. ⟨hal-02363210⟩
