Computational Ecology and Software, 2013, 3(3): 61-73


Machine learning algorithms for predicting roadside fine particulate matter concentration level in Hong Kong Central

Yin Zhao, Yahya Abu Hasan
School of Mathematical Sciences, Universiti Sains Malaysia (USM), Penang, Malaysia

Received 1 May 2013; Accepted 5 June 2013; Published online 1 September 2013

Data mining is an approach to discovering knowledge from large data sets. Pollutant forecasting is an important problem in the environmental sciences. This paper uses data mining methods to forecast the concentration level of fine particles (PM2.5) in Hong Kong Central, a well-known business centre in Asia. Several classification algorithms are available in data mining, such as Artificial Neural Network (ANN) and Support Vector Machine (SVM); both are machine learning algorithms used in a variety of areas. This paper builds predictive models of the PM2.5 concentration level based on ANN and SVM using R packages. The data set comprises meteorological data and PM2.5 data for the period 2008-2011. The PM2.5 concentration is divided into two levels: low and high. The critical point is 40 μg/m3 (24-hour mean), which is based on the standard of the US Environmental Protection Agency (EPA). The parameters of both models are selected by multiple rounds of cross-validation. According to 100 repetitions of 10-fold cross-validation, the testing accuracy of SVM is around 0.803-0.820, which is much better than that of ANN, whose accuracy is around 0.746-0.793.
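
The evaluation protocol described in the abstract (a two-level label cut at 40 μg/m3, then repeated 10-fold cross-validation) can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' R code. The feature names, the synthetic data, the choice of a strict "greater than" comparison at the threshold, and the nearest-centroid classifier are all assumptions made for the example; the nearest-centroid model is only a stand-in and is neither an SVM nor an ANN.

```python
import random
from statistics import mean

THRESHOLD = 40.0  # 24-hour mean cut-off in ug/m3, as quoted in the abstract


def to_level(pm25):
    """Map a PM2.5 value to 'low' or 'high' (strict '>' is an assumption)."""
    return "high" if pm25 > THRESHOLD else "low"


def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_cv_accuracy(X, y, fit, predict, k=10, repeats=100, seed=0):
    """Return one mean test accuracy per repetition of k-fold CV."""
    rng = random.Random(seed)
    per_repeat = []
    for _ in range(repeats):
        fold_acc = []
        for test_idx in k_fold_indices(len(X), k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(X)) if i not in held_out]
            model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
            hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
            fold_acc.append(hits / len(test_idx))
        per_repeat.append(mean(fold_acc))
    return per_repeat


def fit_centroids(X, y):
    """Stand-in classifier (NOT SVM or ANN): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict_centroid(centroids, x):
    """Predict the class whose centroid is closest in squared distance."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def make_synthetic(n=200, seed=1):
    """Generate made-up data; 'wind' and 'rain' are hypothetical features."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        wind = rng.uniform(0.0, 10.0)
        rain = rng.uniform(0.0, 10.0)
        pm25 = 70.0 - 4.0 * wind - 2.0 * rain + rng.gauss(0.0, 5.0)
        X.append([wind, rain])
        y.append(to_level(pm25))
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    acc = repeated_cv_accuracy(X, y, fit_centroids, predict_centroid, repeats=5)
    print(f"accuracy range over repeats: {min(acc):.3f}-{max(acc):.3f}")
```

Reporting the minimum and maximum of the per-repetition accuracies gives a range of the same kind as the 0.803-0.820 and 0.746-0.793 figures quoted above, although the numbers produced here come from synthetic data and say nothing about the paper's results. Swapping in a real SVM or ANN only requires passing different `fit` and `predict` functions.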

Keywords: Artificial Neural Network (ANN); Support Vector Machine (SVM); PM2.5 prediction; data mining; machine learning.

International Academy of Ecology and Environmental Sciences. E-mail: office@iaees.org
Copyright © 2009-2022 International Academy of Ecology and Environmental Sciences. All rights reserved.
