Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual Semantic Relatedness using Machine Translation
Refereed Conference Meeting Proceeding
This paper provides a comparative analysis of the performance of four state-of-the-art distributional semantic models (DSMs) over 11 languages, contrasting the native language-specific models with the use of machine translation over English-based DSMs. The experimental results show that there is a significant improvement (average of 16.7% for the Spearman correlation) by using state-of-the-art machine translation approaches. The results also show that the benefit of using the most informative corpus outweighs the possible errors introduced by the machine translation. For all languages, the combination of machine translation over the Word2Vec English distributional model provided the best results consistently (average Spearman correlation of 0.68).
Knowledge Engineering and Knowledge Management (EKAW)
Digital Object Identifier (DOI):
National University of Ireland, Galway (NUIG)
Open access repository: