False positive reduction endeavors with automated feature engineering
Credit card fraud has been a problem for decades, and with the boom in online shopping, fraud losses are expected to rise every year. Fraud detection systems often generate more false positives than true positives in order to detect a higher share of fraudulent transactions. These false positives have plagued the fraud detection industry for years, as they are expensive to investigate and require extensive manual labor. We implemented an automated feature engineering approach to reduce the high number of false positives while retaining most of the true positives. We generate a high-dimensional feature space (1,750 features) of rich features without manual intervention other than specifying the primitives. In addition, a feature reduction method is implemented to retain the features with the highest predictive power and so counteract the dimensionality problem of the method. To benchmark our results, we created two additional datasets. The first dataset, referred to as the baseline, included only the cleaned original features. In the second dataset, we generated manual features from the original data to reproduce the situation of a domain expert. The proposed solution was tested with XGBoost to quantify the effect of automated feature engineering on the reduction of false positives and was compared to the benchmarking datasets. Our analysis of the results shows that automated feature engineering can reduce false positives by 84% while retaining 89% of the true positives compared to the baseline dataset. In addition, we find no significant difference between automated and manual feature engineering in discarding false positives; the two methods perform equally well.
However, the results suggest that an automated approach can substantially reduce feature engineering time while providing richer features than manual feature engineering, which points to potential bottom-line savings through a reduced need for domain experts and improved efficiency in the analytical life cycle.
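As an illustration of how the two headline figures can be read, the following minimal sketch computes a false positive reduction and a true positive retention rate relative to a baseline. The counts are hypothetical and chosen only so that the arithmetic reproduces the reported 84% and 89%; the actual confusion matrices are not given in this abstract, and the function names are assumptions made for the example.

```python
def false_positive_reduction(fp_baseline: int, fp_model: int) -> float:
    """Fraction of the baseline's false positives that the model eliminates."""
    if fp_baseline <= 0:
        raise ValueError("baseline must have at least one false positive")
    return 1.0 - fp_model / fp_baseline


def true_positive_retention(tp_baseline: int, tp_model: int) -> float:
    """Fraction of the baseline's true positives that the model still catches."""
    if tp_baseline <= 0:
        raise ValueError("baseline must have at least one true positive")
    return tp_model / tp_baseline


# Hypothetical counts, NOT taken from the thesis.
baseline = {"tp": 100, "fp": 500}   # model trained on cleaned original features
automated = {"tp": 89, "fp": 80}    # model trained on automatically engineered features

fp_cut = false_positive_reduction(baseline["fp"], automated["fp"])
tp_kept = true_positive_retention(baseline["tp"], automated["tp"])

print(f"False positives reduced by {fp_cut:.0%}")
print(f"True positives retained:   {tp_kept:.0%}")
```

Under these assumed counts, investigators would review 80 false alarms instead of 500 while missing 11 of the 100 fraud cases that the baseline caught, which is the trade-off the abstract describes.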
Master's thesis (MSc) in Master of Science in Business Analytics - Handelshøyskolen BI, 2021