Nuclear magnetic resonance (NMR) is used in organic chemistry to identify unknown organic compounds. The data obtained from an NMR spectrometer are typically shown as a spectrum, which is then analyzed by an analytical chemist. Analyzing a spectrum, especially that of a large and complex molecule, is a long and tedious process. In this project, Python is used to implement hierarchical clustering on NMR data obtained from an NMR spectrometer at the College of Wooster, to explore the application of clustering in NMR analysis. MATLAB is used to build a decision tree from the same data, and its accuracy is compared to that of the hierarchical clustering. The decision tree is also examined for information about how to better automate the analysis process. Once feature extraction has been performed, these clustering and classification methods are used to identify the major functional groups within a compound from its spectral data. Once these functional groups are identified, the compounds are clustered via hierarchical clustering or classified with a decision tree. This process provides insight into how to identify unknown organic molecules faster and more accurately, a much-needed improvement in experimental organic chemistry research. Decision trees were found to be a much more accurate machine learning method than hierarchical clustering for classifying organic compounds by the functional groups present.
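As a rough illustration of the clustering step described in the abstract, the sketch below groups compounds by which functional groups are present. It is not the thesis's implementation: the compound names, the four feature columns, the Jaccard distance, and the average-linkage rule are all assumptions chosen for a small, self-contained example, and the feature vectors are hand-written stand-ins for the output of feature extraction on real spectra.

```python
from itertools import combinations

# Hypothetical feature vectors (assumed, not taken from the thesis data):
# 1 = functional group present, 0 = absent.
# Assumed column order: [hydroxyl, carbonyl, aromatic ring, alkene]
compounds = {
    "ethanol":     (1, 0, 0, 0),
    "isopropanol": (1, 0, 0, 0),
    "acetone":     (0, 1, 0, 0),
    "2-butanone":  (0, 1, 0, 0),
    "toluene":     (0, 0, 1, 0),
    "styrene":     (0, 0, 1, 1),
}


def jaccard_distance(a, b):
    """Distance between two binary vectors: 1 - |shared groups| / |all groups|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def average_linkage(c1, c2, features):
    """Mean pairwise distance between the members of two clusters."""
    pairs = [(p, q) for p in c1 for q in c2]
    total = sum(jaccard_distance(features[p], features[q]) for p, q in pairs)
    return total / len(pairs)


def agglomerate(features, n_clusters):
    """Bottom-up clustering: repeatedly merge the two closest clusters."""
    clusters = [[name] for name in features]  # start with one cluster per compound
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], features),
        )
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]                          # j > i, so index i stays valid
    return clusters


clusters = agglomerate(compounds, n_clusters=3)
for members in clusters:
    print(sorted(members))
```

With these toy vectors the three clusters correspond to the alcohols, the ketones, and the aromatic compounds. On real data the result depends heavily on how features are extracted from the spectra and on the choice of distance and linkage, which this sketch only assumes.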
Powell, Nicole Maia, "The Application of Machine Learning in Analyzing Organic Compounds from NMR Spectral Data" (2021). Senior Independent Study Theses. Paper 9471.
Analytical Chemistry | Artificial Intelligence and Robotics | Computer Sciences | Organic Chemistry
Machine learning, organic chemistry, NMR, nuclear magnetic resonance, hierarchical clustering, decision trees
Bachelor of Arts
Senior Independent Study Thesis Exemplar
© Copyright 2021 Nicole Maia Powell