Multilingual Multimodal Machine Translation for Dravidian Languages utilizing Phonetic Transcription

Authors:

Bharathi Raja, Ruba Priyadharshini, Bernardo Stearns, Arun Jayapal, Sridevy S, Mihael Arcan, Manel Zarrouk, John McCrae

Publication Type:

Refereed Conference Meeting Proceeding

Abstract:

Multimodal machine translation is the task of translating from a source text into the target language using information from other modalities. Existing multimodal datasets have been restricted to highly resourced languages. In addition, these datasets were collected by manual translation of English descriptions from the Flickr30K dataset. In this work, we introduce MMDravi, a Multilingual Multimodal dataset for under-resourced Dravidian languages. It comprises 30,000 sentences, which were created using several machine translation outputs. Using data from MMDravi and a phonetic transcription of the corpus, we build a Multilingual Multimodal Neural Machine Translation system (MMNMT) for closely related Dravidian languages to take advantage of the multilingual corpus and other modalities. We evaluate the translations generated by the proposed approach against a human-annotated evaluation dataset using the BLEU metric. Relying on multilingual corpora, phonetic transcription, and image features, our approach improves the translation quality for the under-resourced languages.

Conference Name:

Workshop on Technologies for MT of Low-Resource Languages, co-located with Machine Translation Summit (MT Summit 2019), Dublin, Ireland.

Proceedings:

European Association for Machine Translation

Digital Object Identifier (DOI):

10.xx

Publication Date:

20/08/2019

Volume:

Proceedings of the 2nd Workshop on Technologies for MT of Low Resource Languages

Conference Location:

Ireland

Research Group:

Linked Data
Semantic Web

Institution:

National University of Ireland, Galway (NUIG)

Project Acknowledgements:

European Lexicographic Infrastructure

Open access repository:

Yes

https://www.aclweb.org/anthology/W19-6809
