The purpose of this study is to use machine learning techniques to recognize sarcasm in social media comments so that their content can be understood more accurately. In the age of social media, it has become more difficult to predict the interests of younger generations through focus groups. Applying machine learning algorithms to social media posts to gauge how users feel about a topic has recently emerged as a replacement for focus groups. However, social media comments are often heavily sarcastic, meaning that the sentiment being expressed is the opposite of what the comment explicitly says. Accurately determining users' opinions from their comments therefore requires taking sarcasm into account. Using a corpus of comments from the social media website Reddit, each tagged as either sarcastic or not, and the scikit-learn Python library, this study investigates several machine learning approaches to text classification, including naive Bayes classifiers, decision trees, and random forest classifiers. Similar past studies are also reviewed. The results of this study can be compared against those of past work and then extended in a new direction.
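As a rough illustration of the kind of comparison the abstract describes, the following minimal sketch trains the three named classifier families with scikit-learn and reports cross-validated accuracy. It is not the thesis's actual code: the toy comments and labels, the helper name `compare_classifiers`, the bag-of-words features from `CountVectorizer`, and the 2-fold cross-validation are all assumptions made for the example.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data standing in for the tagged Reddit corpus
# (1 = sarcastic, 0 = not sarcastic).
comments = [
    "oh great, another monday, just what i needed",
    "yeah because that always works so well",
    "wow, what a totally original idea",
    "sure, i love waiting in line for hours",
    "the update fixed the crash on my phone",
    "i think the second season was better",
    "thanks for the link, it was helpful",
    "the game comes out next friday",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def compare_classifiers(texts, targets, folds=2):
    """Return mean cross-validated accuracy for each classifier family."""
    models = {
        "naive_bayes": MultinomialNB(),
        "decision_tree": DecisionTreeClassifier(random_state=0),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    }
    scores = {}
    for name, model in models.items():
        # Assumed feature representation: raw word counts (bag of words).
        pipeline = make_pipeline(CountVectorizer(), model)
        scores[name] = cross_val_score(pipeline, texts, targets, cv=folds).mean()
    return scores


if __name__ == "__main__":
    for name, accuracy in compare_classifiers(comments, labels).items():
        print(f"{name}: {accuracy:.2f}")
```

With only eight toy comments the accuracies are not meaningful; the sketch is meant only to show how the classifiers could be compared on identical features and folds.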


Sommer, Nathan


Computer Science

Publication Date


Degree Granted

Bachelor of Arts

Document Type

Senior Independent Study Thesis



© Copyright 2018 Mary-Hannah E. Boyer