Financial Fraud Detection — Applied Machine Learning

This project tackled the real-world problem of financial fraud, a global crisis estimated at $3.1 trillion in illicit flows in 2023 alone.

As a core team member, I contributed to the design and execution of the full data science pipeline. We engineered a synthetic dataset inspired by real-world sources, including PaySim and IBM's Anti-Money Laundering dataset, with behavioral, transactional, and demographic features. I worked on data cleaning, feature selection, and model development in PyCaret, where we benchmarked several classification algorithms, including logistic regression, random forest, and gradient boosting, and evaluated them on AUC, precision, recall, and F1-score. We supplemented the supervised model with an Isolation Forest, an unsupervised anomaly detection method, which independently flagged ~5% of transactions as suspicious, aligning well with the supervised results. The project also included a rigorous ethical analysis of algorithmic bias, false positive risks, and fairness, all directly relevant to real-world financial deployment.
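
The evaluation metrics named above can be illustrated with a minimal sketch in plain Python. This is not the project's code (which relied on PyCaret for benchmarking); the function names are hypothetical, and the labels and scores are made-up toy values, not project data.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = fraud, 0 = legitimate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the probability that a random fraud case outscores a random
    legitimate one, counting ties as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (assumed values): 10 transactions, 3 of them fraudulent.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.7, 0.9, 0.6, 0.4, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

On imbalanced fraud data, precision and recall tend to be more informative than accuracy, because a model that flags nothing can still score high accuracy while catching no fraud.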

Course

Applied Data Science - Tufts

Role

Team Member

Skills

Machine Learning, Python & PyCaret, Feature Engineering, Data Visualization, Model Evaluation & Validation, Ethical AI Analysis

Project Document

Click to view

Presentation

Click to view
