Self-selection bias of similarity metrics in translation memory evaluation

Authors: 

Friedel Wolff, Laurette Pretorius, Loïc Dugast, Paul Buitelaar

Publication Type: 
Refereed Original Article
Abstract: 
A translation memory system attempts to retrieve useful suggestions from previous translations to assist a translator in a new translation task. While assisting the translator with a specific segment, some similarity metric is usually employed to select the best matches from previously translated segments to present to the translator. Automated methods for evaluating a translation memory system usually use reference translations and some similarity metric. Such evaluation methods might be expected to assist in choosing between competing systems. No single evaluation method has gained widespread use, and the similarity metric used in these methods is not standardised either. This paper investigates the consequences of substituting the similarity metric in such an evaluation method, and finds that the similarity metrics exhibit a strong bias for the system using the same metric for retrieval. Consequently, the choice of similarity metric in the evaluation of translation memory systems should be carefully reconsidered.
Digital Object Identifier (DOI): 
10.1007/s10590-016-9185-8
ISSN: 
0922-6567
Publication Status: 
Published
Date Accepted for Publication: 
Tuesday, 20 December, 2016
Publication Date: 
Saturday, 21 January, 2017
Journal: 
Machine Translation
Pages: 
1-16
Research Group: 
Institution: 
National University of Ireland, Galway (NUIG)
Open access repository: 
No