Unmasking Banking Fraud: Unleashing the Power of Machine Learning and Explainable AI (XAI) on Imbalanced Data

S. M.Nuruzzaman Nobel, Shirin Sultana, Sondip Poul Singha, Sudipto Chaki, Md Julkar Nayeen Mahi, Tony Jan, Alistair Barros, Md Whaiduzzaman

Research output: Contribution to journalArticlepeer-review

Abstract

Recognizing fraudulent activity in the banking system is essential due to the significant risks involved. When fraudulent transactions are vastly outnumbered by non-fraudulent ones, dealing with imbalanced datasets can be difficult. This study aims to determine the best model for detecting fraud by comparing four commonly used machine learning algorithms: Support Vector Machine (SVM), XGBoost, Decision Tree, and Logistic Regression. Additionally, we utilized the Synthetic Minority Over-sampling Technique (SMOTE) to address the issue of class imbalance. The XGBoost Classifier proved to be the most successful model for fraud detection, with an accuracy of 99.88%. We utilized SHAP and LIME analyses to provide greater clarity into the decision-making process of the XGBoost model and improve overall comprehension. This research shows that the XGBoost Classifier is highly effective in detecting banking fraud on imbalanced datasets, with an impressive accuracy score. The interpretability of the XGBoost Classifier model was further enhanced by applying SHAP and LIME analysis, which shed light on the significant features that contribute to fraud detection. The insights and findings presented here are valuable contributions to the ongoing efforts aimed at developing effective fraud detection systems for the banking industry.

Original languageEnglish
Article number298
JournalInformation (Switzerland)
Volume15
Issue number6
DOIs
Publication statusPublished - Jun 2024

Keywords

  • decision tree
  • fraud detection
  • LIME analysis
  • logistic regression
  • machine learning
  • oversampling SMOTE
  • SHAP analysis
  • SVM
  • XGBoost

Fingerprint

Dive into the research topics of 'Unmasking Banking Fraud: Unleashing the Power of Machine Learning and Explainable AI (XAI) on Imbalanced Data'. Together they form a unique fingerprint.

Cite this