
A Survey of Current Datasets for Code-Switching Research

Authors:

Navya Jose, Bharathi Raja, Shardul Suryawanshi, Elizabeth Sherly, John McCrae

Publication Type:

Refereed Conference Meeting Proceeding

Abstract:

Code-switching is a prevalent phenomenon in multilingual communities and in social media interaction. Over the past ten years, we have witnessed an explosion of code-switched data on social media that brings together languages ranging from low-resourced to high-resourced in the same text, sometimes written in a non-native script. This increases the demand for processing code-switched data to assist users in various natural language processing tasks such as part-of-speech tagging, named entity recognition, sentiment analysis, conversational systems, and machine translation. The available corpora for code-switching research have played a major role in advancing this area of research. In this paper, we propose a set of quality metrics to evaluate these datasets and categorize them accordingly.

Conference Name:

2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS)

Proceedings:

2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS)

Digital Object Identifier (DOI):

10.1109/ICACCS48705.2020.9074205

Publication Date:

06/03/2020

Pages:

136-141

Conference Location:

India

Research Group:

Linked Data

Institution:

National University of Ireland, Galway (NUIG)

Open access repository:

Yes

https://ieeexplore.ieee.org/abstract/document/9074205
