Publication:
Parallel power load abnormalities detection using fast density peak clustering with a hybrid canopy-K-means algorithm

dc.citedby0
dc.contributor.authorAl-Jumaili A.H.A.en_US
dc.contributor.authorMuniyandi R.C.en_US
dc.contributor.authorHasan M.K.en_US
dc.contributor.authorSingh M.J.en_US
dc.contributor.authorPaw J.K.S.en_US
dc.contributor.authorAl-Jumaily A.en_US
dc.contributor.authorid57212194331en_US
dc.contributor.authorid14030355800en_US
dc.contributor.authorid55057479600en_US
dc.contributor.authorid58765817900en_US
dc.contributor.authorid58168727000en_US
dc.contributor.authorid57208087596en_US
dc.date.accessioned2025-03-03T07:45:58Z
dc.date.available2025-03-03T07:45:58Z
dc.date.issued2024
dc.description.abstractParallel power load anomalies are processed by a fast density peak clustering technique that combines the strengths of the Canopy and K-means algorithms within Apache Mahout's distributed machine-learning environment. The study uses Apache Hadoop's tools for data storage and processing, including HDFS and MapReduce, to manage and analyze data at big-data scale. In the preprocessing phase, Canopy clustering, run as a distributed job through Apache Mahout, quickly produces an initial partitioning of the data points, which K-means then refines to improve clustering performance. Experimental results confirm that using Canopy as an initial step markedly reduces the computational effort needed to process the large quantity of parallel power load abnormalities. The hybrid algorithm was implemented to minimise the time needed to handle the massive number of detected parallel power load abnormalities. Data vectors are generated based on the time needed, sequential and parallel candidate feature data are obtained, and the data rate is combined. After the time set is classified using Canopy with the K-means algorithm and the factor-weighted vector representation, the clustering quality is assessed using purity, precision, recall, and F value. The results show that using Canopy as a preprocessing step reduces both the time taken to process the large number of power load abnormalities found in parallel using a fast density peak dataset and the running time of the K-means algorithm. Additionally, tests demonstrate that the combination of Canopy and K-means performs consistently and dependably on the Hadoop platform and yields a clustering result that offers a scalable and effective solution for power system monitoring. © 2024 - IOS Press. All rights reserved.en_US
dc.description.natureFinalen_US
dc.identifier.doi10.3233/IDA-230573
dc.identifier.epage1346
dc.identifier.issue5
dc.identifier.scopus2-s2.0-85215378155
dc.identifier.spage1321
dc.identifier.urihttps://www.scopus.com/inward/record.uri?eid=2-s2.0-85215378155&doi=10.3233%2fIDA-230573&partnerID=40&md5=16f6c6c7217638414360d4fe7b424d1a
dc.identifier.urihttps://irepository.uniten.edu.my/handle/123456789/36943
dc.identifier.volume28
dc.pagecount25
dc.publisherIOS Press BVen_US
dc.sourceScopus
dc.sourcetitleIntelligent Data Analysis
dc.subjectAnomaly detection
dc.subjectCluster analysis
dc.subjectAbnormality detection
dc.subjectAbnormality detection and adjustment
dc.subjectApache mahout
dc.subjectCanopy algorithm
dc.subjectHybrid (CKMA)
dc.subjectK-mean algorithm
dc.subjectK-mean algorithms
dc.subjectLoad data
dc.subjectPower load
dc.subjectPower load data
dc.subjectK-means clustering
dc.titleParallel power load abnormalities detection using fast density peak clustering with a hybrid canopy-K-means algorithmen_US
dc.typeArticleen_US
dspace.entity.typePublication
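
The abstract describes Canopy clustering as a cheap preprocessing pass whose output seeds K-means, with the result scored by measures such as purity. The following is a minimal single-machine sketch of that idea, not the paper's Mahout/MapReduce implementation. The function names, the thresholds `t1` and `t2`, and the sample points are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def mean(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def canopy_centers(points, t1, t2):
    """Canopy pass: return one center per canopy (assumes t1 > t2).

    Each seed collects every point within the loose threshold t1;
    points within the tight threshold t2 stop being seed candidates.
    """
    if t1 <= t2:
        raise ValueError("t1 must be larger than t2")
    remaining = list(points)
    centers = []
    while remaining:
        seed = remaining.pop(0)
        members = [p for p in points if math.dist(seed, p) <= t1]
        centers.append(mean(members))
        remaining = [p for p in remaining if math.dist(seed, p) > t2]
    return centers


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def kmeans(points, centers, max_iter=100):
    """Lloyd's K-means, started from the given centers (here: canopy centers)."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Keep the old center if a cluster ends up empty.
        updated = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    labels = [nearest(p, centers) for p in points]
    return centers, labels


def purity(labels, truth):
    """Fraction of points that carry the majority true label of their cluster."""
    majority_total = 0
    for cluster in set(labels):
        members = [t for l, t in zip(labels, truth) if l == cluster]
        majority_total += Counter(members).most_common(1)[0][1]
    return majority_total / len(labels)


if __name__ == "__main__":
    # Hypothetical 2-D load feature vectors: three "normal", three "abnormal".
    loads = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (8.0, 8.1), (7.9, 8.3), (8.2, 7.9)]
    truth = [0, 0, 0, 1, 1, 1]
    seeds = canopy_centers(loads, t1=3.0, t2=1.5)
    centers, labels = kmeans(loads, seeds)
    print(len(seeds), labels, purity(labels, truth))
```

In this sketch the canopy pass fixes both the number of clusters and their starting positions, so K-means does not need a user-supplied `k` or random initialisation. That is one plausible reading of why the abstract reports a shorter K-means running time.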