Fraud detection is a cornerstone of modern e-commerce, yet it is one of the least publicized domains in machine learning. That’s for a good reason: it’s an adversarial domain, where fraudsters constantly invent new ways to bypass existing models, and model developers constantly invent new ways to catch them.
The goal of fraud detection systems is to block fraudulent transactions, such as those placed by fake accounts using stolen credit cards, while at the same time avoiding any friction in the shopping experience of genuine customers. False negatives (fraudulent transactions that mistakenly passed through the system) result in monetary loss, also known as ‘bad debt’, due to chargebacks initiated by the actual credit card owners, while false positives (genuine transactions that were blocked) result in poor customer experience and churn.
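To make the two error types concrete, here is a minimal Python sketch that counts them for a handful of transactions. The function name `count_errors` and the toy labels are made up for illustration; they do not come from any real system or dataset.

```python
from typing import Dict, Sequence


def count_errors(is_fraud: Sequence[bool], blocked: Sequence[bool]) -> Dict[str, int]:
    """Count false negatives and false positives for a batch of transactions.

    is_fraud[i] is the ground truth for transaction i (hypothetical labels).
    blocked[i] is the system's decision for transaction i.
    """
    if len(is_fraud) != len(blocked):
        raise ValueError("is_fraud and blocked must have the same length")

    # False negative: a fraudulent transaction that was allowed through.
    false_negatives = sum(1 for f, b in zip(is_fraud, blocked) if f and not b)
    # False positive: a genuine transaction that was blocked.
    false_positives = sum(1 for f, b in zip(is_fraud, blocked) if not f and b)

    return {"false_negatives": false_negatives, "false_positives": false_positives}


# Toy data, invented for this example only.
is_fraud = [False, False, True, False, True, False]
blocked = [False, True, True, False, False, False]

errors = count_errors(is_fraud, blocked)
print(errors)
```

In this toy batch, one fraudulent order slips through (a potential chargeback) and one genuine order is blocked (a frustrated customer). A real system would weigh these two counts by their very different business costs.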
Consider that a modern e-commerce provider may process on the order of tens of millions of orders per day, and that fraud rates are at the sub-percent level, and you start to see why this is a challenging domain. It’s the ultimate needle-in-a-haystack problem, where the haystacks are overwhelmingly large and…