Publication:
Malware Detection Using Deep Learning and Correlation-Based Feature Selection

dc.citedby: 70
dc.contributor.author: Alomari E.S. (en_US)
dc.contributor.author: Nuiaa R.R. (en_US)
dc.contributor.author: Alyasseri Z.A.A. (en_US)
dc.contributor.author: Mohammed H.J. (en_US)
dc.contributor.author: Sani N.S. (en_US)
dc.contributor.author: Esa M.I. (en_US)
dc.contributor.author: Musawi B.A. (en_US)
dc.contributor.authorid: 58668473000 (en_US)
dc.contributor.authorid: 57226309117 (en_US)
dc.contributor.authorid: 57862594800 (en_US)
dc.contributor.authorid: 57202657688 (en_US)
dc.contributor.authorid: 57196190931 (en_US)
dc.contributor.authorid: 57203682775 (en_US)
dc.contributor.authorid: 57439487000 (en_US)
dc.date.accessioned: 2024-10-14T03:21:51Z
dc.date.available: 2024-10-14T03:21:51Z
dc.date.issued: 2023
dc.description.abstract: Malware is one of the most frequent cyberattacks, with its prevalence growing daily across the network. Malware traffic is always asymmetrical compared to benign traffic, which is always symmetrical. Fortunately, there are many artificial intelligence techniques that can be used to detect malware and distinguish it from normal activities. However, the problem of dealing with large and high-dimensional data has not been addressed enough. In this paper, a high-performance malware detection system using deep learning and feature selection methodologies is introduced. Two different malware datasets are used to detect malware and differentiate it from benign activities. The datasets are preprocessed, and then correlation-based feature selection is applied to produce different feature-selected datasets. The dense and LSTM-based deep learning models are then trained using these different versions of feature-selected datasets. The trained models are then evaluated using many performance metrics (accuracy, precision, recall, and F1-score). The results indicate that some feature-selected scenarios preserve almost the same original dataset performance. The different nature of the used datasets shows different levels of performance changes. For the first dataset, the feature reduction ratios range from 18.18% to 42.42%, with performance degradation of 0.07% to 5.84%, respectively. The second dataset reduction rate is between 81.77% and 93.5%, with performance degradation of 3.79% and 9.44%, respectively. © 2023 by the authors. (en_US)
dc.description.nature: Final (en_US)
dc.identifier.ArtNo: 123
dc.identifier.doi: 10.3390/sym15010123
dc.identifier.issue: 1
dc.identifier.scopus: 2-s2.0-85146783397
dc.identifier.uri: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85146783397&doi=10.3390%2fsym15010123&partnerID=40&md5=f2337f655af4b8f4a26c0b963638681a
dc.identifier.uri: https://irepository.uniten.edu.my/handle/123456789/34699
dc.identifier.volume: 15
dc.publisher: MDPI (en_US)
dc.relation.ispartof: All Open Access
dc.relation.ispartof: Gold Open Access
dc.source: Scopus
dc.sourcetitle: Symmetry
dc.subject: deep learning
dc.subject: dense model
dc.subject: feature selection
dc.subject: LSTM
dc.subject: malware detection
dc.title: Malware Detection Using Deep Learning and Correlation-Based Feature Selection (en_US)
dc.type: Article (en_US)
dspace.entity.type: Publication