Publication:
Developing an ensembled machine learning model for predicting water quality index in Johor River Basin

dc.citedby4
dc.contributor.authorSidek L.M.en_US
dc.contributor.authorMohiyaden H.A.en_US
dc.contributor.authorMarufuzzaman M.en_US
dc.contributor.authorNoh N.S.M.en_US
dc.contributor.authorHeddam S.en_US
dc.contributor.authorEhteram M.en_US
dc.contributor.authorKisi O.en_US
dc.contributor.authorSammen S.S.en_US
dc.contributor.authorid35070506500en_US
dc.contributor.authorid56780374500en_US
dc.contributor.authorid56976224000en_US
dc.contributor.authorid57205236493en_US
dc.contributor.authorid25226555100en_US
dc.contributor.authorid57113510800en_US
dc.contributor.authorid6507051085en_US
dc.contributor.authorid57192093108en_US
dc.date.accessioned2025-03-03T07:41:31Z
dc.date.available2025-03-03T07:41:31Z
dc.date.issued2024
dc.description.abstractThe Water Quality Index (WQI) model has become a widely used tool for evaluating surface water quality for agricultural, domestic and industrial use. WQI is one of the simplest mathematical tools for assisting water operators in decision making when assessing water quality, and it has been widely used in recent years. In this study, water quality analysis and prediction are conducted for the Johor River Basin in Johor, Peninsular Malaysia, using data from water quality monitoring stations along the river from upstream to downstream. A numerical method is first used to calculate the WQI and identify the water classes against which the prediction results are validated. Then, two optimized ensemble machine learning models, gradient boosting regression (GB) and random forest regression (RF), are employed to predict the WQI. The initial phase of the study analyzes all available data on the river's parameters to gain a comprehensive understanding of overall water quality within the basin. Temporal analysis determined that Mg, E. coli, SS and DS are critical factors affecting water quality in this river basin. For WQI calculation, a feature importance method is then used to identify the most important parameters for predicting the WQI. Finally, an ensemble-based machine learning model is designed to predict the WQI from three parameters. The two ensemble approaches achieved an R² of 0.86 for RF-based regression and 0.85 for GB-based regression. The research demonstrates that, using only the biochemical oxygen demand (BOD), the chemical oxygen demand (COD) and the percentage of dissolved oxygen (DO%), the WQI can be predicted accurately, and the water class can be predicted correctly for almost 96 out of 100 samples using the GB ensemble algorithm. Moving forward, stakeholders may opt to integrate this research into their analyses, potentially yielding economic reliability and time savings. © The Author(s) 2024.en_US
dc.description.natureFinalen_US
dc.identifier.ArtNo67
dc.identifier.doi10.1186/s12302-024-00897-7
dc.identifier.issue1
dc.identifier.scopus2-s2.0-85189181276
dc.identifier.urihttps://www.scopus.com/inward/record.uri?eid=2-s2.0-85189181276&doi=10.1186%2fs12302-024-00897-7&partnerID=40&md5=ce7a82dc6ff4ffa1b843e2eb68070528
dc.identifier.urihttps://irepository.uniten.edu.my/handle/123456789/36183
dc.identifier.volume36
dc.publisherSpringeren_US
dc.relation.ispartofAll Open Access; Gold Open Access
dc.sourceScopus
dc.sourcetitleEnvironmental Sciences Europe
dc.subjectJohor
dc.subjectJohor Basin
dc.subjectMalaysia
dc.subjectWest Malaysia
dc.subjectBiochemical oxygen demand
dc.subjectDecision making
dc.subjectDissolved oxygen
dc.subjectEscherichia coli
dc.subjectForecasting
dc.subjectForestry
dc.subjectMachine learning
dc.subjectMathematical operators
dc.subjectNumerical methods
dc.subjectQuality assurance
dc.subjectQuality control
dc.subjectRegression analysis
dc.subjectWater quality
dc.subjectWatersheds
dc.subjectGradient boosting
dc.subjectGradient boosting regression
dc.subjectIndex models
dc.subjectJohor river
dc.subjectMachine learning models
dc.subjectRandom forests
dc.subjectRiver basins
dc.subjectStudy areas
dc.subjectSurface water quality
dc.subjectWater quality indexes
dc.subjectdecision making
dc.subjecthydrological modeling
dc.subjectmachine learning
dc.subjectprediction
dc.subjectregression analysis
dc.subjectriver basin
dc.subjectwater quality
dc.subjectRivers
dc.titleDeveloping an ensembled machine learning model for predicting water quality index in Johor River Basinen_US
dc.typeArticleen_US
dspace.entity.typePublication