
Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters

  • Hai Tao
  • Ali H. Jawad
  • A. H. Shather
  • Zainab Al-Khafaji
  • Tarik A. Rashid
  • Mumtaz Ali
  • Nadhir Al-Ansari
  • Haydar Abdulameer Marhoon
  • Shamsuddin Shahid
  • Zaher Mundher Yaseen
  • Qiannan Normal College for Nationalities
  • Guizhou University
  • Universiti Teknologi MARA
  • University of Alkitab
  • Al-Mustaqbal University College
  • University of Kurdistan Hewlêr
  • University of Southern Queensland
  • Luleå University of Technology
  • Al-Ayen University
  • University of Kerbala
  • Universiti Teknologi Malaysia
  • King Fahd University of Petroleum and Minerals

Research output: Contribution to journal › Article › peer-review

45 Scopus citations

Abstract

This study uses machine learning (ML) models for high-resolution (0.1°×0.1°) prediction of the concentration of fine particulate matter (PM2.5), the air pollutant most harmful to human health, from meteorological and soil data. Iraq was chosen as the study area for implementing the method. Different lags and changing patterns of four European Reanalysis (ERA5) meteorological variables (rainfall, mean temperature, wind speed and relative humidity) and one soil parameter (soil moisture) were screened to select a suitable set of predictors using simulated annealing (SA), a non-greedy algorithm. The selected predictors were used to simulate the temporal and spatial variability of PM2.5 concentration over Iraq during early summer (May-July), the most polluted months, using three advanced ML models: extremely randomized trees (ERT), stochastic gradient descent backpropagation (SGD-BP) and long short-term memory (LSTM), each integrated with a Bayesian optimizer. The spatial distribution of the annual average PM2.5 revealed that the population of the whole of Iraq is exposed to a pollution level above the standard limit. The changes in temperature and soil moisture, together with the mean wind speed and humidity of the month before early summer, can predict the temporal and spatial variability of PM2.5 over Iraq during May-July. LSTM performed best, with a normalized root-mean-square error and Kling-Gupta efficiency of 13.4% and 0.89, compared to 16.02% and 0.81 for SGD-BP and 17.9% and 0.74 for ERT. LSTM could also reconstruct the observed spatial distribution of PM2.5 with MapCurve and Cramer's V values of 0.95 and 0.91, compared to 0.9 and 0.86 for SGD-BP and 0.83 and 0.76 for ERT. The study provides a methodology for forecasting the spatial variability of PM2.5 concentration at high resolution during the peak pollution months from freely available data, which can be replicated in other regions to generate high-resolution PM2.5 forecasting maps.
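
The two point-wise skill scores quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the Kling-Gupta efficiency follows the commonly used three-component form (correlation, variability ratio, bias ratio), and the RMSE is normalized by the mean of the observations, which is an assumption because the abstract does not state the normalization used.

```python
import math
from statistics import mean, pstdev


def nrmse_percent(obs, sim):
    """RMSE as a percentage of the observed mean.

    Assumption: normalization by the observed mean; the abstract does not
    say whether the mean, range, or another quantity was used.
    """
    rmse = math.sqrt(mean((s - o) ** 2 for o, s in zip(obs, sim)))
    return 100.0 * rmse / mean(obs)


def kling_gupta_efficiency(obs, sim):
    """KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2).

    r     : Pearson correlation between observed and simulated values
    alpha : ratio of simulated to observed standard deviation
    beta  : ratio of simulated to observed mean
    A perfect simulation gives KGE = 1.
    """
    mu_o, mu_s = mean(obs), mean(sim)
    sd_o, sd_s = pstdev(obs), pstdev(sim)
    cov = mean((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim))
    r = cov / (sd_o * sd_s)
    alpha = sd_s / sd_o
    beta = mu_s / mu_o
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Made-up values for illustration only; these are not data from the study.
observed = [40.0, 55.0, 62.0, 48.0, 70.0]
simulated = [42.0, 51.0, 65.0, 50.0, 66.0]
print(nrmse_percent(observed, simulated))
print(kling_gupta_efficiency(observed, simulated))
```

Under these definitions a lower NRMSE and a KGE closer to 1 both indicate better agreement, which is the direction in which the abstract ranks LSTM ahead of SGD-BP and ERT.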

Original language: English
Article number: 107931
Journal: Environment International
Volume: 175
State: Published - May 2023
Externally published: Yes

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Air quality prediction
  • Arid climate
  • Machine learning
  • PM concentration
  • Simulated annealing
