Improving Phishing Website Detection Using Data Mining on Web Pages

Subject areas: General

1 - Islamic Azad University, Science and Research Branch, Tehran
2 - ICT Research Institute

Received: 1399/02/01 Accepted: 1399/07/30 Published: 1399/07/30

Keywords: phishing, data mining, feature selection, feature extraction

Abstract:

Phishing is a web-level Internet attack that aims to steal users' personal credentials for online theft. Phishing undermines trust among users of electronic businesses; this study therefore examines methods for detecting phishing websites using data mining. Identifying the salient features of phishing is an important prerequisite for designing an accurate detection system; accordingly, as a first step, a list of 30 features commonly reported for phishing websites was compiled. Then, to improve the efficiency of phishing detection systems, a new two-stage feature reduction method based on feature selection and feature extraction is proposed, which reduces the number of features considerably. Next, the performance of the J48 decision tree, random forest, and naive Bayes methods was evaluated on the reduced features. The results show that the model built to identify phishing websites with two-stage feature reduction, based on a wrapper method and the principal component analysis (PCA) algorithm, reaches an accuracy of 96.58% with the random forest method, which is a favorable result compared with the other methods.

English abstract:

Phishing erodes trust among users of e-commerce business networks. Therefore, in this research, we tried to detect phishing websites using data mining. Detecting the salient features of phishing is an important prerequisite for designing an accurate detection system. To that end, a list of 30 features reported for phishing websites was first prepared. Then, a two-stage feature reduction method based on feature selection and feature extraction was proposed to enhance the efficiency of phishing detection systems; it reduced the number of features significantly. Finally, the performance of the J48 decision tree, random forest, and naïve Bayes methods was evaluated on the reduced features. The results indicated that the model created to identify phishing websites using two-stage feature reduction, based on a wrapper method and the Principal Component Analysis (PCA) algorithm, achieved an accuracy of 96.58% with the random forest method, which is a desirable outcome compared to other methods.
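
The two-stage reduction described in the abstract could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses scikit-learn rather than whatever toolkit the study used, a synthetic 30-feature dataset in place of the real phishing data, naive Bayes as the classifier inside the wrapper, CART as a stand-in for J48, and arbitrary sizes (10 selected features, 5 principal components). The abstract specifies none of these choices.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a 30-feature phishing dataset (NOT the paper's data).
X, y = make_classification(n_samples=400, n_features=30, n_informative=8,
                           n_redundant=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Stage 1 (wrapper selection): greedy forward search, each candidate subset
# scored by cross-validating a cheap classifier (assumed: Gaussian naive Bayes).
# Stage 2 (feature extraction): PCA applied to the selected columns only.
# The sizes 10 and 5 are illustrative assumptions.
reducer = make_pipeline(
    SequentialFeatureSelector(GaussianNB(), n_features_to_select=10,
                              direction="forward", cv=3),
    PCA(n_components=5, random_state=0),
)

# Fit the reduction on the training split only, then reuse it on the test split.
Z_train = reducer.fit_transform(X_train, y_train)
Z_test = reducer.transform(X_test)

# The three classifier families named in the abstract; scikit-learn's CART
# tree is used here as an approximation of J48 (C4.5).
classifiers = {
    "decision tree (CART, stand-in for J48)": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "naive Bayes": GaussianNB(),
}

scores = {}
for name, clf in classifiers.items():
    scores[name] = clf.fit(Z_train, y_train).score(Z_test, y_test)
    print(f"{name}: held-out accuracy {scores[name]:.3f}")
```

The accuracies printed here come from synthetic data and say nothing about the 96.58% reported in the abstract. Fitting the reducer on the training split alone keeps the held-out estimate free of information leaked from the test rows.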

