Abstract

Nuclear magnetic resonance (NMR) is used in organic chemistry to identify unknown organic compounds. The data obtained from an NMR spectrometer are typically shown as a spectrum, which is then analyzed by an analytical chemist. Analyzing a spectrum, especially that of a large and complex molecule, is a long and tedious process. In this project, Python is used to implement hierarchical clustering on data obtained from an NMR spectrometer at the College of Wooster in order to explore the method's application to NMR analysis. MATLAB is used to build a decision tree from the same data, and its accuracy is compared to that of the hierarchical clustering. The decision tree is also examined for information about how to better automate the analysis process. Once feature extraction has been performed, major functional groups within each compound are identified from the spectral data. Once these functional groups are identified, the compounds are clustered via hierarchical clustering or classified with a decision tree. This process provides insight into how to identify unknown organic molecules faster and more accurately, a much-needed improvement in experimental organic chemistry research. It was found that decision trees are a much more accurate machine learning method than hierarchical clustering for classifying the organic compounds on the basis of the functional groups present.
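
The clustering step described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the compounds, the binary functional-group features, the Hamming distance, and the average-linkage rule are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from itertools import combinations

# Hypothetical feature vectors (not data from the thesis).
# Assumed column order: [hydroxyl, carbonyl, aromatic ring];
# 1 means the functional group was detected after feature extraction.
COMPOUNDS = {
    "ethanol":      [1, 0, 0],
    "isopropanol":  [1, 0, 0],
    "acetone":      [0, 1, 0],
    "acetophenone": [0, 1, 1],
    "benzaldehyde": [0, 1, 1],
}


def hamming(a, b):
    """Number of functional-group flags on which two compounds differ."""
    return sum(x != y for x, y in zip(a, b))


def average_linkage(c1, c2, features):
    """Mean pairwise distance between the members of two clusters."""
    dists = [hamming(features[i], features[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def agglomerative(features, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    # Start with every compound in its own cluster.
    clusters = [frozenset([name]) for name in sorted(features)]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], features),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


clusters = agglomerative(COMPOUNDS, 2)
for cluster in clusters:
    print(sorted(cluster))
```

With these assumed inputs, the two alcohols end up in one cluster and the three carbonyl-containing compounds in the other, which mirrors the idea of grouping compounds by the functional groups present.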

Advisor

Visa, Sofia

Department

Computer Science

Disciplines

Analytical Chemistry | Artificial Intelligence and Robotics | Computer Sciences | Organic Chemistry

Keywords

Machine learning, organic chemistry, NMR, nuclear magnetic resonance, hierarchical clustering, decision trees

Publication Date

2021

Degree Granted

Bachelor of Arts

Document Type

Senior Independent Study Thesis Exemplar


© Copyright 2021 Nicole Maia Powell