The purpose of this study is to use machine learning techniques to recognize sarcasm in social media comments so that their content can be understood more accurately. In the age of social media, it has become more difficult to predict the interests of younger generations through focus groups. Applying machine learning algorithms to social media posts to gauge how users feel about a topic has recently emerged as a replacement for focus groups. However, social media comments are often heavily sarcastic, meaning that the sentiment being expressed is the opposite of what the comment explicitly says. Accurately determining users' opinions from their comments therefore requires taking sarcasm into account. Using a corpus of comments from the social media website Reddit, each tagged as either sarcastic or not, and the scikit-learn Python library, this study investigates several machine learning approaches to text classification, such as naive Bayes classifiers, decision trees, and random forest classifiers. Similar past studies are also examined. The results of this study can be compared against those of past works and then taken in a new direction.
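As a rough illustration of the kind of comparison described above, the following minimal sketch trains the three named classifier families on bag-of-words features with scikit-learn. It is not the study's actual code: the toy comments and labels, the helper names `build_models` and `fit_and_predict`, the choice of `CountVectorizer` for features, and all hyperparameters are assumptions made for illustration only.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data, NOT taken from the Reddit corpus used in the study.
# Label 1 = sarcastic, label 0 = not sarcastic.
comments = [
    "oh great, another meeting, just what I needed",
    "wow, what a totally original idea",
    "sure, because that always works out so well",
    "yeah right, like that will ever happen",
    "thanks for the detailed explanation",
    "the update fixed the crash on my laptop",
    "I enjoyed the second season more than the first",
    "the recipe needs two cups of flour",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def build_models():
    """Return one bag-of-words pipeline per classifier family (assumed setup)."""
    return {
        "naive_bayes": make_pipeline(CountVectorizer(), MultinomialNB()),
        "decision_tree": make_pipeline(
            CountVectorizer(), DecisionTreeClassifier(random_state=0)
        ),
        "random_forest": make_pipeline(
            CountVectorizer(),
            RandomForestClassifier(n_estimators=50, random_state=0),
        ),
    }


def fit_and_predict(train_texts, train_labels, new_texts):
    """Fit every model on the training texts and predict labels for new texts."""
    predictions = {}
    for name, model in build_models().items():
        model.fit(train_texts, train_labels)
        predictions[name] = model.predict(new_texts).tolist()
    return predictions


results = fit_and_predict(comments, labels, ["oh great, what a wonderful monday"])
for name, predicted in results.items():
    print(name, predicted)
```

With only eight training comments the predictions carry no real meaning; a genuine comparison would need a labeled corpus of realistic size and a held-out test split.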
Boyer, Mary-Hannah E., "Recognizing Sarcasm Using Machine Learning" (2018). Senior Independent Study Theses. Paper 7972.
Bachelor of Arts
Senior Independent Study Thesis
© Copyright 2018 Mary-Hannah E. Boyer