Abstract

This thesis presents a comparative study of the performance of different machine learning models
for forecasting PM2.5 concentration in several cities in Mongolia, where PM2.5 pollution is a
significant environmental and public health issue. The study develops two types of forecasting
models. The first is a Neural Network model intended for areas that have no air quality monitor
and therefore no past air quality data on which to base a PM2.5 forecast. The second type
comprises the Neural Network Autoregression (NNAR), Seasonal Autoregressive Integrated Moving
Average (SARIMA), and Seasonal Autoregressive Integrated Moving Average with Exogenous factors
(SARIMAX) models, which are intended for cities where historical PM2.5 concentration data are
available.
The dataset used for this study includes daily PM2.5 concentration data as well as
meteorological data collected from air quality monitors and weather stations in Ulaanbaatar, Dornod,
Gobi-Altai, Khovd, Selenge, Tuv, Uvs, Uvurkhangai, and Zavkhan cities from September 2021 to
March 2023. The data are first pre-processed, including data cleaning, and the models are then
fitted to the processed data. Model performance is evaluated using statistical metrics such as
Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). The
comparison of different model results shows that the NNAR model performed the best when using
daily PM2.5 concentration data in Ulaanbaatar, with MAE = 16.55 μg m−3 and RMSE = 23.01 μg m−3.
This model also explained 55% of the variability (R² = 0.55) in the PM2.5 concentration.
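
For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of how MAE, RMSE, and the share of explained variability can be computed from paired observed and forecast values. It is not the thesis's code: the function names and the four sample values are hypothetical, and the explained-variability measure is assumed here to be the coefficient of determination.

```python
import math


def mae(observed, predicted):
    """Mean Absolute Error: average size of the forecast errors."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root Mean Squared Error: penalizes large errors more than MAE does."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: share of variance explained by the forecast."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily PM2.5 values (micrograms per cubic metre), for illustration only.
observed = [40.0, 55.0, 70.0, 35.0]
predicted = [45.0, 50.0, 60.0, 40.0]

print(f"MAE  = {mae(observed, predicted):.2f}")
print(f"RMSE = {rmse(observed, predicted):.2f}")
print(f"R^2  = {r_squared(observed, predicted):.2f}")
```

Both MAE and RMSE are expressed in the same units as the PM2.5 concentration itself, which is why the reported errors carry units of μg m−3.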

Advisor

Horr, Christina

Department

Statistical and Data Sciences

Keywords

Mongolia, PM2.5

Publication Date

2023

Degree Granted

Bachelor of Arts

Document Type

Senior Independent Study Thesis

© Copyright 2023 Sumiyabazar Ganbaatar