Active learning for text classification with reusability

Authors: 

Rong Hu, Brian MacNamee, Sarah Jane Delany

Publication Type: 
Refereed Original Article
Abstract: 
Where active learning with uncertainty sampling is used to generate training sets for classification applications, it is sensible to use the same type of classifier to select the most informative training examples as the type of classifier that will be used in the final classification application. There are scenarios, however, where this might not be possible, for example due to computational complexity. Such scenarios give rise to the reusability problem: are the training examples deemed most informative by one classifier type necessarily as informative for a different classifier type? This paper describes a novel exploration of the reusability problem in text classification scenarios. We measure the impact of using different classifier types in the active learning process and in the classification applications that use the results of active learning. We perform experiments on four different text classification problems, using the three classifier types most commonly used for text classification. We find that the reusability problem is a significant issue in text classification; that, if possible, the same classifier type should be used both in the application and during the active learning process; and that, if the ultimate classifier type is unknown, support vector machines should be used in active learning to maximise reusability.
Digital Object Identifier (DOI): 
10.1016/j.eswa.2015.10.003
Publication Status: 
Published
Publication Date: 
10/11/2015
Journal: 
Expert Systems With Applications
Institution: 
National University of Ireland, Dublin (UCD)
Open access repository: 
No