Thursday, April 28, 2016 at 2:00 PM in Rice 404
Advisor: Hongning Wang
Committee Members: Mary Lou Soffa (Committee Chair), Worthy Martin, and Yanjun Qi.
Title: Collaborative Model Adaptation for Personalized Sentiment Classification
ABSTRACT: Text-based sentiment classification forms the foundation of opinion mining. Current mainstream solutions for sentiment classification mostly focus on building population-level supervised classifiers, which estimate and apply a shared classification model across all users’ opinionated data. This postulates that the joint probability of sentiment labels and text content is independent and identical across different users. Nevertheless, human opinions are idiosyncratic and variable: the same opinion can be expressed in various ways, and the same expression might carry distinct sentiment polarities for different users. For example, the word “expensive” tends to be associated with negative opinions in restaurant reviews, although some users use it to describe their satisfaction with a restaurant’s quality. A global sentiment classifier can hardly recognize such variance across users and consequently leads to suboptimal opinion mining results. Explicitly modeling the heterogeneity among users with user-level sentiment models is hence of great practical value.
Estimating a personalized sentiment classifier is challenging. A straightforward solution is to estimate supervised classifiers on a per-user basis, but the sparse observations of individual users’ opinionated data make this approach ineffective, e.g., prone to over-fitting. Some existing work utilizes semi-supervised methods to address the data sparsity issue, i.e., applying transductive learning algorithms on user-user and user-document relations. However, only one global model can be estimated in such solutions, and it cannot capture the nuances in how individual users express diverse opinions. More importantly, due to the dynamic nature of how users express their opinions, a static sentiment model cannot capture temporal changes in a user’s opinions; timely updates of the personalized models are hence necessary to accurately recognize the polarity of each individual user’s opinions. This requires effective exploitation of users’ own opinionated data and efficient execution of model updates across all users in the collection.