Enhanced Phishing Detection: A Hybrid SVM-Genetic Algorithm Approach

Authors

  • Mohammad M Elsheh, Department of Computer Science, Libyan Academy, Misurata, Libya
  • Sarah Al-mabrouk Ebayou, Department of Computer Science, Libyan Academy, Misurata, Libya

DOI:

https://doi.org/10.65540/jar.v30i1.1362

Keywords:

Genetic Algorithm, Machine Learning, Phishing Website Detection, Support Vector Machine

Abstract

Most public and financial institutions have recently expanded the online services they offer to their clients in response to the growth in internet applications and users. However, most web users are unaware of internet security measures, so attacks on online platforms are steadily increasing. Attackers use various methods to steal users' sensitive information, and phishing websites are among the most common. Detection technologies, including machine learning (ML) methods, therefore need continual improvement. ML methods classify a site as phishing or legitimate based on a number of features obtained from webpages. This paper presents a model for detecting and classifying phishing websites using a Support Vector Machine (SVM) optimized with a Genetic Algorithm (GA) to obtain the best classification accuracy. The collected dataset consists of 12,000 samples: the phishing URLs were collected from the PhishTank website, and the legitimate ones from the Kaggle website. Accuracy, precision, recall, and F1-score were used to evaluate the performance of the presented method. The results were compared with those of previous research that combined SVM with Ant Colony Optimization (ACO). The presented approach achieved a classification accuracy of 97.62%, which is 9.29% higher than the traditional SVM model and almost equal to the SVM-ACO model.
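
To make the idea of a GA-optimized SVM concrete, the following is a minimal sketch of a genetic search over two SVM hyperparameters. It is not the authors' implementation: the abstract does not state what the GA tunes, so the choice of `C` and `gamma`, the search bounds, the operators (tournament selection, uniform crossover, Gaussian mutation, elitism), and all function names are assumptions made for illustration. The fitness function here is a synthetic stand-in so the sketch runs with the standard library alone; in a real experiment it would be replaced by the cross-validated accuracy of an SVM on the phishing dataset.

```python
import math
import random

# Assumed search space: log10 of the SVM penalty C and the RBF kernel width gamma.
BOUNDS = [(-2.0, 3.0), (-4.0, 1.0)]


def stand_in_fitness(genes):
    """Placeholder for cross-validated SVM accuracy (hypothetical surface).

    Peaks at log10(C) = 1 and log10(gamma) = -2 purely for illustration.
    """
    log_c, log_gamma = genes
    return math.exp(-((log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2))


def random_individual(rng):
    """Draw one candidate uniformly from the search bounds."""
    return [rng.uniform(lo, hi) for lo, hi in BOUNDS]


def tournament(population, scores, rng, k=3):
    """Return the fittest of k randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(genes, rng, rate=0.2, sigma=0.3):
    """Add Gaussian noise to some genes, clipped to the search bounds."""
    out = []
    for g, (lo, hi) in zip(genes, BOUNDS):
        if rng.random() < rate:
            g = min(hi, max(lo, g + rng.gauss(0.0, sigma)))
        out.append(g)
    return out


def genetic_search(fitness, pop_size=30, generations=40, seed=0):
    """Evolve a population and return (best_genes, best_score)."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        # Elitism: carry the best individual over unchanged.
        best = population[max(range(pop_size), key=lambda i: scores[i])]
        children = [best]
        while len(children) < pop_size:
            p1 = tournament(population, scores, rng)
            p2 = tournament(population, scores, rng)
            children.append(mutate(crossover(p1, p2, rng), rng))
        population = children
    scores = [fitness(ind) for ind in population]
    top = max(range(pop_size), key=lambda i: scores[i])
    return population[top], scores[top]


best_genes, best_score = genetic_search(stand_in_fitness)
best_c, best_gamma = 10 ** best_genes[0], 10 ** best_genes[1]
print(f"C={best_c:.4g}, gamma={best_gamma:.4g}, fitness={best_score:.4f}")
```

Because the best individual is always carried into the next generation, the best fitness never decreases between generations. To connect the sketch to real data, one plausible option is to have the fitness function train scikit-learn's `SVC` with the decoded `C` and `gamma` and return the mean score from `cross_val_score`; that substitution is a suggestion, not something the abstract describes.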

Published

2026-01-02

How to Cite

Elsheh, M. M., & Ebayou, S. A.- mabrouk. (2026). Enhanced Phishing Detection: A Hybrid SVM-Genetic Algorithm Approach. Journal of Academic Research, 30(1), 09–20. https://doi.org/10.65540/jar.v30i1.1362

Section

Basic Sciences