Refereed Conference Meeting Proceeding
Neologism detection is a key task in constructing lexical resources and has wider implications for NLP; however, the identification of multiword neologisms has received little attention. In this paper, we show that compositional and non-compositional adjective-noun pairs can be effectively distinguished by using pretrained language models and comparing these with individual word embeddings. Our results show that these models significantly improve over baseline linguistic features; however, combining them with linguistic features improves the results further, suggesting the strength of a hybrid approach.
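The general idea of comparing a phrase-level representation with the embeddings of its individual words can be illustrated with a minimal sketch. This is not the paper's implementation: the additive composition, the cosine comparison, the function names, and the toy three-dimensional vectors are all assumptions chosen for illustration. In practice the phrase vector would come from a pretrained language model and the word vectors from a word embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def compositionality_score(phrase_vec, adj_vec, noun_vec):
    """Hypothetical score: similarity between the phrase vector and the
    sum of the adjective and noun vectors (additive composition is an
    assumption, not the paper's stated method)."""
    composed = [a + n for a, n in zip(adj_vec, noun_vec)]
    return cosine(phrase_vec, composed)


# Toy vectors, invented for illustration only.
adj = [1.0, 0.0, 0.0]
noun = [0.0, 1.0, 0.0]
compositional_phrase = [1.0, 1.0, 0.0]      # lies along adj + noun
non_compositional_phrase = [0.0, 0.0, 1.0]  # orthogonal to both words

high = compositionality_score(compositional_phrase, adj, noun)
low = compositionality_score(non_compositional_phrase, adj, noun)
print(high > low)
```

Under these assumptions, a low score suggests that the meaning of the pair is not predictable from its parts. Such a score could serve as one feature alongside linguistic features, in the spirit of the hybrid approach the abstract describes.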
Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019) at ACL 2019
Digital Object Identifier (DOI):
National University of Ireland, Galway (NUIG)
Open access repository: