Insight Excellence: Multimodal Data Analysis for Equality, Diversity and Inclusion (EDI)

Submitted on Thursday, 11/05/2023

To guarantee equality, diversity and inclusion (EDI) across social media platforms, technological solutions need to be developed to identify and track non-compliant messaging at scale. This is of increasing societal importance, and of legal and commercial importance to the big tech companies that provide social media platforms. A key challenge in this area is defining the objective of the task clearly from a computational perspective while taking into account the societal and subjective aspects of EDI. Our research has focused on developing datasets for this task and innovative algorithms that address this challenge accurately and at scale.

Over the past few years, systems have been developed to moderate online content and remove abusive, offensive or hateful speech. However, people in power sometimes misuse this form of censorship to obstruct the democratic right to freedom of speech. Insight researchers are therefore taking a positive reinforcement approach that promotes online content that is encouraging, positive and supportive. Insight Galway has constructed a Hope Speech dataset (HopeEDI) and a Homophobia/Transphobia Detection dataset for Equality, Diversity and Inclusion, containing user-generated comments from social media, as well as a Speech Recognition for Vulnerable Individuals dataset (vulnerable elderly and transgender people) (Chakravarthi et al., 2022). Building on this research, the team built a language model that recognises hope speech, trained on English translations of the code-mixed dataset. The model provides effective cross-lingual transfer between the languages and outperforms other state-of-the-art models (Hande et al., 2022). Insight has also organised several events in this research area, including international challenges to establish benchmarks and baselines. In 2022, Insight built upon previously developed automatic approaches to multimodal classification of offensive or hateful messaging in social media and beyond. Finally, with international colleagues, Insight has developed new methods for dealing with data imbalance in the Hope Speech detection problem (RamakrishnaIyer et al., 2023). All the data is publicly available.

Investigators: Paul Buitelaar (UG), Bharathi Raja Chakravarthi (UG), John McCrae (UG), Suzanne Little (DCU)

Pictured: Paul Buitelaar (UG), Suzanne Little (DCU) and Bharathi Raja Chakravarthi (UG)