Weak Supervision for Semi-Supervised Topic Modeling via Word Embeddings
Refereed Original Article
Semi-supervised algorithms have been shown to improve the results of topic modeling when applied to unstructured text corpora. However, sufficient supervision is not always available. This paper proposes a new process, Weak+, suitable for use in semi-supervised topic modeling via matrix factorization when only limited supervision is available. The process uses word embeddings to provide additional weakly-labeled data, which can result in improved topic modeling performance.
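The abstract does not spell out how Weak+ derives weak labels, so the following is only a minimal sketch of the general idea it describes: using word embeddings to extend a small set of supervised topic words with similar, weakly-labeled words. The function names, the cosine-similarity threshold, and the toy two-dimensional vectors are all assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_seed_words(seeds, embeddings, threshold=0.8):
    """Hypothetical helper: for each topic, collect non-seed words whose
    embedding is close (cosine >= threshold) to at least one seed word."""
    weak_labels = {}
    for topic, seed_words in seeds.items():
        known_seeds = [s for s in seed_words if s in embeddings]
        candidates = set()
        for word, vector in embeddings.items():
            if word in seed_words:
                continue  # already supervised, not a weak label
            if any(cosine(vector, embeddings[s]) >= threshold for s in known_seeds):
                candidates.add(word)
        weak_labels[topic] = candidates
    return weak_labels


# Toy embeddings (assumed values, chosen only to make the example readable).
EMBEDDINGS = {
    "goal": (1.0, 0.1),
    "striker": (0.9, 0.2),
    "election": (0.1, 1.0),
    "ballot": (0.2, 0.9),
}

# Limited supervision: one labeled word per topic.
SEEDS = {"sport": {"goal"}, "politics": {"election"}}

WEAK = expand_seed_words(SEEDS, EMBEDDINGS)
print(WEAK)
```

In a semi-supervised matrix factorization setting, the resulting weakly-labeled words could be added to the supervision supplied for each topic; how the paper weights or filters them is not stated in this record.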
Digital Object Identifier (DOI):
Date Accepted for Publication:
Monday, 3 April, 2017
Conference: Conference on Language, Data and Knowledge
Open access repository: