Translating the FINREP taxonomy using a domain-specific corpus
Refereed Conference Meeting Proceeding
Our research investigates the use of statistical machine translation (SMT) to translate the labels of concepts in an XBRL taxonomy. Taxonomy concepts are often given labels in only one language. To enable knowledge access across languages, such monolingual taxonomies need to be translated into other languages. The primary challenge in label translation is the highly domain-specific vocabulary. To meet this challenge, we adopted an approach based on the creation of domain-specific resources. Applying this approach to the English-to-German translation of the FINREP taxonomy showed that it significantly outperforms SMT trained on general resources.
Machine Translation Summit XIV
Digital Object Identifier (DOI):
Open access repository: