Publication: Air quality forecasting and mapping in Malaysian urban areas: A hybrid deep learning approach
Date
2025-03-05
Authors
Nur'atiah Zaini
Abstract
Rapid growth in industrialization and urbanization has resulted in high concentrations of air pollutants in the environment, causing severe air pollution that has negatively affected human health and well-being. Numerous models have been developed for air quality forecasting, including statistical, traditional machine learning and deep learning models. However, these applications appear limited in handling larger nonlinear time series datasets, and they neglect the effect of meteorological parameters and surrounding air pollutant concentrations on forecasting accuracy. Therefore, this study aims to forecast hourly concentrations of six air pollutants, namely fine particulate matter (PM2.5), inhalable particulate matter (PM10), sulphur dioxide (SO2), nitrogen dioxide (NO2), carbon monoxide (CO) and ozone (O3), at two air quality monitoring sites, Batu Muda and Cheras, located in Kuala Lumpur, Malaysia, using advanced hybrid deep learning models. First, a data decomposition method, ensemble empirical mode decomposition (EEMD), was employed to decompose the original sequence data of the target air quality indicators into several subseries. Two optimization algorithms, the sparrow search algorithm (SSA) and particle swarm optimization (PSO), were used to determine the optimum hyperparameter values of the deep learning approach, such as the number of hidden neurons, batch size and learning rate. Long short-term memory (LSTM) was applied to forecast each decomposed subseries of the target air pollutant individually, considering the influence of other air pollutant indicators, meteorological parameters and neighbouring air pollutant concentrations, for a 1-hour-ahead forecasting horizon. The developed deep learning models were implemented to forecast the six main air pollutant concentrations at both target monitoring sites based on the optimum configuration of input variables and specific historical input. Finally, the performance of the developed models was evaluated using four metrics: root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE) and coefficient of determination (R2). The results demonstrate that the advanced hybrid deep learning models, EEMD-SSA-LSTM and EEMD-PSO-LSTM, outperform LSTM, GRU, EEMD-LSTM, PSO-LSTM and SSA-LSTM on larger air quality datasets, illustrating the capability of the models to learn nonlinear time series data with various temporal dimensionalities. The advanced models also demonstrate superior performance in forecasting air pollutant concentrations in two global cities, indicating their reliability and versatility in learning and forecasting air pollutant concentrations under various pollution levels and climate conditions. Moreover, neighbouring air pollutant concentrations significantly improve the forecasting accuracy of the deep learning models in most cases. Overall, EEMD-SSA-LSTM exhibits better forecasting performance than EEMD-PSO-LSTM for PM2.5 concentration, with an RMSE and R2 of 3.67 µg/m3 and 0.98 at Batu Muda, and 3.60 µg/m3 and 0.99 at Cheras. The results indicate the advantages of the advanced model in air quality forecasting, which incorporates various exogenous parameters and neighbouring information.
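The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas used in the study, so the standard textbook definitions are assumed here (in particular, R2 as one minus the ratio of residual to total sum of squares). The function name `forecast_metrics` and the sample values are hypothetical and are not taken from the study's data.

```python
import math


def forecast_metrics(observed, predicted):
    """Return RMSE, MAE, MAPE (%) and R2 for two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]

    # Root mean square error and mean absolute error, in the data's units.
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    # MAPE is undefined when an observed value is zero;
    # this sketch assumes no zeros in `observed`.
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n

    # Coefficient of determination (assumed definition: 1 - SS_res / SS_tot).
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Made-up hourly PM2.5 values (µg/m3), for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = forecast_metrics(observed, predicted)
```

RMSE and MAE are expressed in the units of the pollutant concentration, whereas MAPE and R2 are unitless, which is why the abstract reports RMSE in µg/m3 alongside a dimensionless R2.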
Description
2024