According to the National Institute of Mental Health (NIMH), depressive disorders (or major depression) are among the most common and serious health risks in the United States. Our study focuses on identifying non-medical factors associated with a depressive disorder diagnosis, such as overall health status, health risk behaviors, demographics, and healthcare access, using the 2018 Behavioral Risk Factor Surveillance System (BRFSS) data set collected by the Centers for Disease Control and Prevention (CDC).

Our study of depressive disorder diagnosis in the United States has two objectives. First, we use machine learning algorithms and statistical methods to build models that identify the factors associated with depressive disorders in young, middle, and old adulthood. Second, using the attributes mined for each adult group, we predict depressive disorders within that group and evaluate the performance of each prediction task. Through the study, we gain an in-depth understanding of what influences depressive disorder diagnosis in each adult group, and of how useful machine learning and statistical approaches are for mining information about these factors and predicting depressive disorders.


Advisor

Visa, Sofia

Second Advisor

Frazier, Marian


Department

Computer Science; Mathematics


Disciplines

Artificial Intelligence and Robotics | Data Science | Statistics and Probability


Keywords

Machine Learning, Statistical Models, Artificial Intelligence in Healthcare, Predictive Analytics, Decision Trees, Logistic Regression, Support Vector Machines, Depression

Publication Date


Degree Granted

Bachelor of Arts

Document Type

Senior Independent Study Thesis Exemplar



© Copyright 2021 Minhwa Lee