"You can't just turn the crank": Machine learning for fighting abuse on the consumer web

"You can't just turn the crank"
Machine learning for ﬁghting abuse on the consumer web
David Freeman
Research Scientist/Engineer, Facebook Inc.
ScAINet 2018
Atlanta, GA USA, 11 May 2018

What do they try to do?
Malware
Payment 
Fraud
Scraping
Click
Fraud
Phishing Spam Social 
Engineering
Fake
Products
Scams
"Like"
FraudPromotion
Fraud
Identity
Theft
What do we see?
Fake
Reviews
Misinfor-
mation
Financial
Theft
Account
Resale
Fundamental question: Which requests are bad?
• Perfect for machine learning!

What could possibly go wrong?
Machine learning workﬂow
Label
Train
Validate
Launch
Measure
Proﬁt!Lots!

How do we obtain labeled data?
(hint: not from your users)
Label

• Human labeling of random samples.
• Labelers don't always know what they're looking for
• Labelers are inconsistent (with themselves and each other)
• Labelers get tired (esp. if most samples are good)
• Apply crowdsourcing best practices:
• Precise deﬁnitions, multiple labeling, ML-assisted sampling
• But will it scale?
Labeling: Gold standard
Objective measurement

• Find high-precision signals of badness
• Examples: unusual user-agent, malformed header
• DO NOT BLOCK ON THESE SIGNALS
• They are controlled by the adversary
• When the adversary adapts you will lose visibility
• Automatically generate signals using anomaly detection.
Labeling: Silver standard
Automatic labeling

• Use whatever you have!
• CS data, rules, other models 
• Mitigate risks of blindness and feedback loops:
• Oversample manually labeled examples
• Oversample false positives and false negatives when retraining.
• Undersample positive examples from previous iterations of this model.
• Sample and label examples near the decision boundary
Labeling: Bronze standard
Be scrappy

• Users are terrible at
reporting.
• Product ﬂows bias
reporting.
• Reports can be gamed.
• Reports can serve as an
directional measure.
Labeling: Iron standard
Have your users do the work

• Segment the problem
• e.g. status with link from country X
• Downsample intelligently
• if your distribution is lumpy, sample from all the lumps
• Learning the prior vs. focusing on the bad stuﬀ
• no golden rule here -- you have to experiment
Assembling a training set
Labeling is just the beginning

{Training set 2
{
Model v2
Refreshing your data
Don't forget the past!
{Training set 1 {
Model v1
Mitigation:
• Keep old attacks around (exponential decay?)
• Keep old models around (raise thresholds?)
{
Training set N
{
Model vN

How do you know your model is ready to go?
Train
Validate

• Labels aren't perfect
• Often miss on recall 
• Models interact with each other
• Use oﬄine P-R and ROC to stack-rank model candidates
Validating Performance
Don't trust oﬄine replay Model B
FP
Model A

• Fundamental A/B testing assumption: 
Experiment eﬀects are independent of the cohorts chosen
 
The Perils of A/B Testing

A B
X
• Looks good so far....
Start with a small experiment

A B
X
• Did the adversary give up or iterate?
Roll it out to (almost) everyone — Option 1

A B
• Now your experiment is a vulnerability
Roll it out to (almost) everyone — Option 2

• Run new model online in "log-only" mode
• Evaluate performance where the new
model disagrees with the old one.
• ideally via sampling & labeling
• Push based on FP/FN tradeoﬀ
Using Shadow Mode
Prod model
FP
New model

How do you ﬁgure out if it worked?
Launch
Measure

True Positives Don't Matter
What's happening here?
Time
Precision

• Really want # of good users aﬀected
• Solution: use one minus speciﬁcity (aka FPR)
True Positives Don't Matter
What's happening here?
Time
TP
Time
FP vs.
Time
FP
Time
TP
1
TN
FP + TN<latexit sha1_base64="FfpNUyvhzZ48h+ueQ2Hy3cgzX50=">AAACDXicbZDLSgMxFIYz9VbrbdSlm2AVBLHMiKDLoiCupEJv0BlKJs20oUlmSDJiGeYF3Pgqblwo4ta9O9/GtJ2Ftv4Q+PKfc0jOH8SMKu0431ZhYXFpeaW4Wlpb39jcsrd3mipKJCYNHLFItgOkCKOCNDTVjLRjSRAPGGkFw6txvXVPpKKRqOtRTHyO+oKGFCNtrK594MIT6IUS4dTjQfSQ1m+zLMfrGjyG43vXLjsVZyI4D24OZZCr1rW/vF6EE06Exgwp1XGdWPspkppiRrKSlygSIzxEfdIxKBAnyk8n22Tw0Dg9GEbSHKHhxP09kSKu1IgHppMjPVCztbH5X62T6PDCT6mIE00Enj4UJgzqCI6jgT0qCdZsZABhSc1fIR4gE402AZZMCO7syvPQPK24TsW9OytXL/M4imAP7IMj4IJzUAU3oAYaAINH8AxewZv1ZL1Y79bHtLVg5TO74I+szx8ZwZry</latexit><latexit sha1_base64="FfpNUyvhzZ48h+ueQ2Hy3cgzX50=">AAACDXicbZDLSgMxFIYz9VbrbdSlm2AVBLHMiKDLoiCupEJv0BlKJs20oUlmSDJiGeYF3Pgqblwo4ta9O9/GtJ2Ftv4Q+PKfc0jOH8SMKu0431ZhYXFpeaW4Wlpb39jcsrd3mipKJCYNHLFItgOkCKOCNDTVjLRjSRAPGGkFw6txvXVPpKKRqOtRTHyO+oKGFCNtrK594MIT6IUS4dTjQfSQ1m+zLMfrGjyG43vXLjsVZyI4D24OZZCr1rW/vF6EE06Exgwp1XGdWPspkppiRrKSlygSIzxEfdIxKBAnyk8n22Tw0Dg9GEbSHKHhxP09kSKu1IgHppMjPVCztbH5X62T6PDCT6mIE00Enj4UJgzqCI6jgT0qCdZsZABhSc1fIR4gE402AZZMCO7syvPQPK24TsW9OytXL/M4imAP7IMj4IJzUAU3oAYaAINH8AxewZv1ZL1Y79bHtLVg5TO74I+szx8ZwZry</latexit><latexit sha1_base64="FfpNUyvhzZ48h+ueQ2Hy3cgzX50=">AAACDXicbZDLSgMxFIYz9VbrbdSlm2AVBLHMiKDLoiCupEJv0BlKJs20oUlmSDJiGeYF3Pgqblwo4ta9O9/GtJ2Ftv4Q+PKfc0jOH8SMKu0431ZhYXFpeaW4Wlpb39jcsrd3mipKJCYNHLFItgOkCKOCNDTVjLRjSRAPGGkFw6txvXVPpKKRqOtRTHyO+oKGFCNtrK594MIT6IUS4dTjQfSQ1m+zLMfrGjyG43vXLjsVZyI4D24OZZCr1rW/vF6EE06Exgwp1XGdWPspkppiRrKSlygSIzxEfdIxKBAnyk8n22Tw0Dg9GEbSHKHhxP09kSKu1IgHppMjPVCztbH5X62T6PDCT6mIE00Enj4UJgzqCI6jgT0qCdZsZABhSc1fIR4gE402AZZMCO7syvPQPK24TsW9OytXL/M4imAP7IMj4IJzUAU3oAYaAINH8AxewZv1ZL1Y79bHtLVg5TO74I+szx8ZwZry</latexit><latexit sha1_base64="FfpNUyvhzZ48h+ueQ2Hy3cgzX50=">AAACDXicbZDLSgMxFIYz9VbrbdSlm2AVBLHMiKDLoiCupEJv0BlKJs20oUlmSDJiGeYF3Pgqblwo4ta9O9/GtJ2Ftv4Q+PKfc0jOH8SMKu0431ZhYXFpeaW4Wlpb39jcsrd3mipKJCYNHLFItgOkCKOCNDTVjLRjSRAPGGkFw6txvXVPpKKRqOtRTHyO+oKGFCNtrK594MIT6IUS4dTjQfSQ1m+zLMfrGjyG43vXLjsVZyI4D24OZZCr1rW/vF6EE06Exgwp1XGdWPspkppiRrKSlygSIzxEfdIxKBAnyk8n22Tw0Dg9GEbSHKHhxP09kSKu1IgHppMjPVCztbH5X62T6PDCT6mIE00Enj4UJgzqCI6jgT0qCdZsZABhSc1fIR4gE402AZZMCO7syvPQPK24TsW9OytXL/M4imAP7IMj4IJzUAU3oAYaAINH8AxewZv1ZL1Y79bHtLVg5TO74I+szx8ZwZry</latexit>

Not so fast!
Proﬁt!Adapt!

What not to Do (I)
Show the adversary what your limits are
Message 500 people
Message 400 people
Message 300 people
🛑
🛑
✅

• Introduce delay in blocking
response (and/or) 
• Undo the damage without
telling the user.
What to do (I)
Don't give immediate feedback

"We don't want to be the ones solving the CAPTCHAs"
What not to Do (II)
Look for speciﬁc content to block

What to Do (II)
Focus on bad behavior, not only bad content

What to Do (III)
Use data the adversary doesn't know/control

Scoring at Entry Points
prevent access to accounts
Clustering, Anomaly Detection
prevent accounts from doing damage
User Reporting
ﬁnd false negatives
Behavioral Analysis
detect bad activityIncreasing
speed
More
information
available
What to Do (IV)
Defense in depth

• Think about each step of the ML process.
• It's hard to build a good training set.
• Adversarial adaptation breaks many assumptions.
• Control the data & the response.
Take aways
Thanks to: Hervé Robert, Isaac Fullinwider, Henry Lu, Sagar Patel, Hongyang Li, Nektarios Leontiadis

"You can't just turn the crank": Machine learning for fighting abuse on the consumer web

More Related Content

What's hot (20)

Similar to "You can't just turn the crank": Machine learning for fighting abuse on the consumer web (20)

Recently uploaded (20)

"You can't just turn the crank": Machine learning for fighting abuse on the consumer web