Active Learning: Past, Present, and Future
Department of Computer Science
Illinois Institute of Technology
A fundamental task of machine learning is prediction: a model is built from existing input-output pairs and then applied to future instances where the input is known but the output is not. Examples include spam detection, sentiment analysis, and movie recommendation. Constructing enough training data for predictive models is a tedious and costly process that requires expert and user feedback: emails need to be classified as spam/ham, phrases in reviews need to be tagged as positive/negative, and movies need to be rated. Active learning is the subfield of machine learning that aims to train an accurate model with minimal expert and user feedback.
Active learning has been studied for the past two decades, and many methods have been developed. In this talk, I will provide a survey of the most frequently used active learning strategies. In addition to providing theoretical background, I will discuss the results of an extensive empirical study highlighting the strengths and weaknesses of these strategies. I will conclude with current and future research trends, along with an example application to homophilic networks.
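As a rough illustration of the idea of training with minimal feedback, the sketch below runs pool-based active learning with uncertainty sampling, one commonly used strategy, on a toy one-dimensional problem. Everything in it is an assumption made for illustration and is not taken from the talk: the `oracle` function stands in for the human expert, the hidden boundary 0.62 is arbitrary, and the threshold classifier is chosen only because it keeps the example short.

```python
import random


def oracle(x):
    """Hypothetical stand-in for the expert: returns the true label of x."""
    return 1 if x >= 0.62 else 0  # 0.62 is an arbitrary hidden boundary


def fit_threshold(labeled):
    """Fit a 1-D threshold classifier from (x, y) pairs.

    The decision boundary is the midpoint between the largest labeled
    negative and the smallest labeled positive (0.0 / 1.0 if none yet).
    """
    negatives = [x for x, y in labeled if y == 0]
    positives = [x for x, y in labeled if y == 1]
    lo = max(negatives) if negatives else 0.0
    hi = min(positives) if positives else 1.0
    return (lo + hi) / 2.0


def active_learn(pool, budget):
    """Query at most `budget` labels, always asking about the pool instance
    the current model is least certain about (closest to the boundary)."""
    labeled = []
    unlabeled = list(pool)
    threshold = fit_threshold(labeled)
    for _ in range(budget):
        if not unlabeled:
            break
        # Uncertainty sampling: pick the instance nearest the boundary.
        x = min(unlabeled, key=lambda u: abs(u - threshold))
        unlabeled.remove(x)
        labeled.append((x, oracle(x)))  # one unit of expert feedback
        threshold = fit_threshold(labeled)
    return threshold, labeled


rng = random.Random(0)
pool = [rng.random() for _ in range(1000)]
threshold, labeled = active_learn(pool, budget=15)

# Count pool instances the learned threshold classifies incorrectly.
errors = sum((x >= threshold) != bool(oracle(x)) for x in pool)
print(f"queried {len(labeled)} of {len(pool)} labels, "
      f"threshold={threshold:.4f}, pool errors={errors}")
```

Here the learner asks for only 15 of the 1000 available labels. Because each query targets the region where the model is least certain, the learned threshold narrows in on the hidden boundary much as a binary search would, whereas labeling instances at random would typically need far more feedback to reach the same precision.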