So much language learning now takes place online – the Duolingo app, for example, has over 20m daily active users. This is great news for multilingualism, but to maintain language diversity we must ensure that minority and historical languages also benefit from digital transformation. This is challenging. The cutting-edge machine learning and natural language processing techniques used to build digital language resources require large quantities of text data. This means that the communities and academics who use these minority and historical languages need to feed large numbers of words and phrases into these digital learning models.
Many minority and historical languages are severely under-resourced from a digital point of view, and their users have minimal opportunities to contribute to digital resources. Furthermore, historical linguists face technical and financial obstacles when creating and using digital resources.
There is a pressing need to remove these barriers to entry for the creation and upkeep of digital text resources.
The Cardamom Workbench is a web-based tool that allows users to generate text for under-resourced languages. It has an intuitive graphical user interface, and no technical expertise is required to use it.
Its underlying technologies improve as users add more text and annotations, and contributors are credited for the text and annotations they create.
Cardamom can empower language communities to drive the digitisation of their own languages, giving those languages a digital presence that will preserve the existing body of knowledge and foster the survival and evolution of minority and historical languages and cultures for future generations.
Cardamom is funded by an Irish Research Council Laureate Award for basic research, awarded to Insight’s Dr John McCrae. The Cardamom team also includes Insight members Bernardo Stearns and Adrian Doyle.
Find out more at https://www.cardamom-project.org/