Finding the balance: How machine learning is the only route to scalable, accurate fraud prevention

Bradley

3 years ago

Despite the stereotype of forged documents in dimly lit basements, fraud has become a sophisticated digital operation with professional hierarchies and processes. Specialized groups run different links in the value chain: some steal payment information; others test credit cards or sell stolen credentials; still others fraudulently acquire goods. And they compete with one another to provide the best service, devising new techniques to stay ahead.

E-commerce professionals need effective protections against this growing fraud industrial complex. The best tools are dynamic, adapting to shifts in fraudsters’ methods and techniques without blocking legitimate customers. Customer behavior also evolves over time, so keeping fraud tools dynamic helps reduce false positives even at times when fraudsters aren’t using new approaches.

But these solutions are too expensive for most businesses to build themselves. As a result, smaller businesses waste time manually reviewing transactions and lose vital revenue to fraud attacks, while large enterprises find that their purpose-built fraud operations cost millions, are hard to keep up-to-date, and drain resources from core business functions.

Worse, purpose-built solutions don’t fully address the issue. Fraud is a network problem: a small cadre of professionals targeting millions of businesses, exploiting different vulnerabilities. The most robust solutions can learn from those businesses’ collective experiences to understand the real nature of the problem—and how to fight back.

Adaptive, large-scale machine learning—made available as a service—is that solution. It analyzes large datasets to spot patterns invisible to humans, while being configurable to businesses’ particular fraud risks. Machine learning powers tools like Radar, the fraud prevention system I work on at Stripe, evaluating transactions for fraud risk and taking appropriate action.

How do these systems work?

At its core, Stripe Radar tries to answer one question: is a given transaction likely to be legitimate, or fraudulent? To work that out, its machine learning uses hundreds of data points about each transaction, like a credit card’s issuing country, the time and location of the purchase, the value of the goods bought, or how those payment details have been used in prior transactions.
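To make "data points about each transaction" concrete, here is a minimal sketch of how a transaction might be turned into numeric features. It is an illustration only, not Radar's actual feature set: the `Transaction` fields and the `extract_features` function are hypothetical names chosen to mirror the examples above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Transaction:
    # Hypothetical fields mirroring the examples in the text;
    # a real system would use hundreds of signals.
    card_country: str
    purchase_country: str
    amount_usd: float
    timestamp: datetime
    prior_uses_of_card: int


def extract_features(tx: Transaction) -> dict[str, float]:
    """Turn one transaction into numeric features a model could consume."""
    return {
        # 1.0 when the card was issued in a different country than the purchase
        "country_mismatch": float(tx.card_country != tx.purchase_country),
        "amount_usd": tx.amount_usd,
        "hour_of_day": float(tx.timestamp.hour),
        # 1.0 when these payment details have been used before
        "card_seen_before": float(tx.prior_uses_of_card > 0),
    }


example = Transaction(
    card_country="US",
    purchase_country="FR",
    amount_usd=120.0,
    timestamp=datetime(2021, 3, 1, 3, 15, tzinfo=timezone.utc),
    prior_uses_of_card=0,
)
features = extract_features(example)
```

No single feature here decides anything on its own; the model's job is to weigh many such signals together.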

To learn, Radar analyzes the hundreds of billions of transactions that have already taken place on the Stripe network, learning to associate these features with either fraudulent or legitimate activity. Stripe supplements this with data about disputes from card issuers, a direct line to ground-truth training data about which transactions were legitimate or fraudulent, which leads to higher-quality machine learning. When it’s deployed, Radar evaluates transactions for fraud risk and can automatically block those it considers suspicious.
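The idea of learning from labeled history can be sketched with something far simpler than a production model. The toy example below only counts how often each feature value was later disputed, with add-one smoothing so rare values don't produce extreme estimates. The function name and the data are made up for illustration, and this is not the kind of model Radar uses.

```python
from collections import defaultdict


def fraud_rate_by_value(history, smoothing=1.0):
    """Estimate a smoothed fraud rate for each feature value.

    history: iterable of (feature_value, was_disputed) pairs, where
    was_disputed plays the role of the ground-truth label.
    """
    fraud = defaultdict(int)
    total = defaultdict(int)
    for value, was_disputed in history:
        total[value] += 1
        fraud[value] += int(was_disputed)
    # Add-one (Laplace) smoothing over the two outcomes: fraud / not fraud.
    return {
        value: (fraud[value] + smoothing) / (total[value] + 2 * smoothing)
        for value in total
    }


# Hypothetical labeled history for a single "country mismatch" feature.
history = [
    ("mismatch", True),
    ("mismatch", True),
    ("mismatch", False),
    ("match", False),
    ("match", False),
    ("match", False),
]
rates = fraud_rate_by_value(history)
```

With more history, estimates like these become more reliable, which is one way to see why the size of the dataset matters.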

Scale is the magic ingredient. The larger the dataset, the more likely it is to include a given card, merchant, or fraud attack pattern. That helps machine learning work out even more nuanced patterns of fraudulent and nonfraudulent behavior. As the network grows, so does ML’s performance: a growing network and a faster velocity of updates to its fraud model meant Radar increased the amount of fraud it caught by 20% last year, with no discernible increase in false positives.

Why does this matter for businesses?

Preventing fraud can be critical to a business’s ability to operate. Our user Adblock was experiencing so much card testing, a type of fraud in which fraudsters make a large number of fraudulent payments to check that stolen credit card details work, that they were nearly cut off from the card networks.

“We were on the cusp of losing the ability to process payments due to the number of fraudulent transactions,” says Matt Maier, Adblock CEO. “That’s when we started using Stripe Radar. With Radar, we were able to programmatically fight fraud and institute fine-grained ways to fight back against card testers.”
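Card testing tends to show up as a burst of payment attempts from one source in a short time. A simple way to picture a defense is a sliding-window velocity check, sketched below. The class name, the thresholds, and the idea of keying on a single `source_id` are all assumptions for illustration; this is not how Radar or Adblock detect card testing, and a fixed rule like this is far cruder than a learned model.

```python
from collections import defaultdict, deque


class CardTestingDetector:
    """Flag a source that makes too many payment attempts in a short window."""

    def __init__(self, max_attempts=5, window_seconds=60.0):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        # source_id -> timestamps (in seconds) of recent attempts
        self._attempts = defaultdict(deque)

    def record_attempt(self, source_id, timestamp):
        """Record one attempt; return True if the source exceeds the limit."""
        window = self._attempts[source_id]
        window.append(timestamp)
        # Drop attempts that have fallen out of the time window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_attempts


detector = CardTestingDetector(max_attempts=3, window_seconds=10.0)
flags = [detector.record_attempt("ip-1", t) for t in [0, 1, 2, 3, 4]]
```

In this run the fourth and fifth attempts are flagged, because more than three attempts arrived within ten seconds.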

How can my business get started with machine learning?

It’s not enough for machine learning to be effective. It also needs to be simple to use. Providing ML-driven fraud prevention as a service solves several problems for merchants.

First, merchants can find the optimal balance of minimal fraud and maximum conversion. When businesses use brute-force techniques to minimize disputes and chargebacks, they often can’t easily measure the number of legitimate transactions they block. They risk turning away many more customers than they have to, lowering their revenue and damaging their reputation. A third of customers won’t shop again with a merchant that falsely declined them.
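The balance between fraud and conversion can be made measurable once each transaction has a risk score and an eventual label. The sketch below, using made-up scores on a 0 to 1 scale, computes what share of fraud a blocking threshold catches and what share of legitimate payments it turns away. The function name and data are hypothetical.

```python
def evaluate_threshold(scored, threshold):
    """Measure the trade-off of blocking everything at or above `threshold`.

    scored: list of (risk_score, is_fraud) pairs.
    """
    fraud_total = sum(1 for _, is_fraud in scored if is_fraud)
    legit_total = len(scored) - fraud_total
    fraud_blocked = sum(1 for s, f in scored if f and s >= threshold)
    legit_blocked = sum(1 for s, f in scored if not f and s >= threshold)
    return {
        # Share of fraudulent transactions that would be blocked
        "fraud_caught": fraud_blocked / fraud_total if fraud_total else 0.0,
        # Share of legitimate transactions that would be wrongly blocked
        "legit_blocked": legit_blocked / legit_total if legit_total else 0.0,
    }


# Hypothetical scored transactions: (risk score, was it actually fraud?)
scored = [
    (0.9, True), (0.7, True),
    (0.6, False), (0.3, False), (0.2, False), (0.1, False),
]
strict = evaluate_threshold(scored, 0.5)
lenient = evaluate_threshold(scored, 0.8)
```

On this toy data the strict threshold catches all the fraud but blocks a quarter of legitimate customers, while the lenient one blocks no legitimate customers but misses half the fraud. A better model shifts the whole curve rather than just moving along it.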

Second, when integrated into payments, machine learning can make other improvements to the payments flow. Think of a captcha, one of those boxes that ask you to “click on all the boats” to prove you are not running an automated script. To help prevent fraud, Stripe users can deploy captchas to verify suspicious customers based on the Radar score. Because that system is powered by large-scale ML, legitimate customers almost never see a captcha and aren’t turned away.
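One way to picture score-based verification is a three-way decision: let low-risk payments through, challenge the uncertain middle, and block only the highest-risk ones. The sketch below assumes a score between 0 and 1 and two made-up cutoffs; these are not Radar's actual scale, thresholds, or API.

```python
def choose_action(risk_score, challenge_at=0.5, block_at=0.9):
    """Map a risk score in [0, 1] to 'allow', 'challenge', or 'block'.

    The cutoffs are hypothetical defaults chosen for illustration.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score >= block_at:
        return "block"
    if risk_score >= challenge_at:
        # e.g. show a captcha instead of declining outright
        return "challenge"
    return "allow"


actions = [choose_action(score) for score in (0.1, 0.6, 0.95)]
```

The middle band is what spares legitimate customers: a borderline payment gets a chance to prove itself rather than being declined.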

Third, fraud prevention as a service is ready out of the box. No need to spend time on upfront integration work, hand-labeling data, or shuffling transaction data between a payment processor and fraud providers: Radar receives this information straight from the payment flow.

Finally, ML as a service means businesses can tap into the engineering talent that tracks the latest fraud trends without needing to hire specialists themselves. Effective fraud prevention needs to adapt to changing trends in fraud techniques; Radar’s performance would degrade by half a percentage point each month without ongoing retraining. Stripe’s engineers have built Radar to learn continuously from recent transactions, tripling the speed at which they release new fraud prevention models, which prevents millions of dollars more fraud while letting millions of dollars more legitimate transactions through.

Large-scale ML keeps businesses a step ahead of fraudsters for minimal effort—without spoiling the customer experience. If you’re interested in learning more about just how a machine learning model is developed and deployed, be sure to check out Stripe’s Machine Learning Guide.

This article was contributed by Emmanuel Ameisen, Machine Learning Engineer at Stripe