Graph Gurus Episode 3
Detecting Fraud and Money Laundering
In Real Time with a Graph DB, Part 1
© 2018 TigerGraph. All Rights Reserved
Welcome
● Attendees are muted but you can talk to us via Chat in Zoom
● We will have 10 min for Q&A at the end
● Send questions at any time using the Q&A tab in the Zoom menu
● The webinar will be recorded
● A link to the presentation and reproducible steps will be emailed
Developer Edition Download https://www.tigergraph.com/developer/
Today’s Moderator: Dr. Victor Lee, Director of Product Management
● BS in Electrical Engineering and Computer Science from UC Berkeley
● MS in Electrical Engineering from Stanford University
● PhD in Computer Science from Kent State University focused on graph data mining
● 15+ years in tech industry
Today’s Guru: Dr. Dan Hu, Distinguished AI Research Scientist
● BS & MS in Physics from University of Science and Technology of China (USTC)
● PhD in Quantum Computation from University of California, Merced
● 3-Year TigerGraph Veteran
● Solution Architect, Graph Query Language Designer, Database Core Engineer
Real-Time Phone-Based Fraud Detection
Massive, Worldwide Problem
● 18 Billion robocalls in US in 2017 (hiya.com)
● Spam/Scam - agile, spoofed numbers
Customer:
● 600M subscribers
● 300M calls/day, peak 10K calls/sec
● Need: Real-time detection of various types of phone-based fraud
Real-Time Phone Anti-Spam/Scam Detection
TigerGraph Solution: Real-time graph-based machine learning and decision system
Graph Analytics
● Real-Time Machine Learning
○ 118 graph features per call
○ Retrained periodically with 2M calls
● Real-Time Decisions
○ Call recipient sees alert if ML system says call is suspicious
● In production since Dec 2016
Graph Database
● 600M phone numbers (inside and outside network)
● 15B phone-phone call edges (2 month sliding window)
○ Time
○ Duration
● Real-time graph updates, peak 10K+ calls/sec
○ 118 graph features per phone
Examples of Graph Features for Machine Learning
Bad Phone Features
(1) Short term call duration
(2) Empty stable group
(3) No call back phone
(4) Many rejected calls
(5) Average distance > 3
Good Phone Features
(1) High call back phone
(2) Stable group
(3) Long term phone
(4) Many in-group connections
(5) 3-step friend relation
China Mobile - Detecting Phone-Based Fraud by Analyzing Network or Graph Relationship Features
● Each phone node has a fraud flag, indicating whether it is a good phone or a bad phone and, if bad, what type (scam, harassment, advertisement).
● Run real-time GSQL query for each call:
○ Collect 118 features
○ Compute composite score
○ Update fraud flag
○ Return fraud type
Machine Learning with TigerGraph In Depth
China Mobile Anti-Fraud/Scam Detection
Phone Fraud Real-Time Detection System
phone vertex
- fraud flag
- expiration time
phone_phone edge
- num of call
- total duration
- call date list
- num of rejection
● 600 Million Vertices
● 15+ Billion Edges
● 300 Million Daily Updates
Case 1: Call Type was recently flagged
Case 2: Call needs to be classified
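The slides give only the titles of the two cases. One plausible reading, assumed here, is that a fraud flag whose expiration time has not yet passed is returned directly (Case 1), and otherwise the call is classified and the flag refreshed (Case 2). The sketch below illustrates that reading in Python rather than GSQL; PhoneVertex, handle_call, classify, and ttl_seconds are hypothetical names, not part of the presented system.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PhoneVertex:
    """Hypothetical in-memory stand-in for a phone vertex."""
    fraud_flag: Optional[str] = None  # e.g. "good", "scam", "harassment", "advertisement"
    expiration_time: float = 0.0      # epoch seconds until which fraud_flag is trusted


def handle_call(caller: PhoneVertex,
                classify: Callable[[PhoneVertex], str],
                now: Optional[float] = None,
                ttl_seconds: float = 3600.0) -> str:
    """Return the fraud type for a call from `caller`.

    `classify` stands in for feature collection plus scoring;
    `ttl_seconds` is an assumed flag lifetime.
    """
    now = time.time() if now is None else now

    # Case 1: the flag was set recently and has not expired, so reuse it.
    if caller.fraud_flag is not None and now < caller.expiration_time:
        return caller.fraud_flag

    # Case 2: the call needs to be classified; update the flag and its expiry.
    caller.fraud_flag = classify(caller)
    caller.expiration_time = now + ttl_seconds
    return caller.fraud_flag
```

Under this reading, the expensive feature collection runs only when no unexpired flag exists for the calling phone.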
© 2018 TigerGraph. All Rights Reserved
Machine Learning with TigerGraph
Real-time Scoring with Multiple ML models in GSQL
• Why TigerGraph?
• Fast: Real-time response for both feature collection and scoring.
• Efficient: Supports aggregation during traversal, so multiple features are collected in one query.
• Easy: Easy to collect complex features (RDBMS needs multi-join).
• GSQL/TigerGraph collects 118 graph features and performs fraud scoring with
multiple Machine Learning models in real time.
• logistic regression
• K-clustering
• ML models are trained offline; ML model parameters stored as configuration
files for GSQL to use for real-time scoring.
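As a rough illustration of scoring with offline-trained parameters, the Python sketch below loads logistic regression weights from a configuration string and applies them to a feature vector. The JSON layout, the feature names, and the weight values are invented for the example; the slides do not describe the actual configuration format, and the production scoring runs inside GSQL.

```python
import json
import math

# Hypothetical configuration produced by offline training (values are made up).
CONFIG_JSON = """
{
  "bias": -1.0,
  "weights": {"num_rejected_calls": 0.8, "stable_group_size": -0.5}
}
"""


def load_model(text):
    """Parse the bias and per-feature weights from a JSON configuration."""
    cfg = json.loads(text)
    return cfg["bias"], cfg["weights"]


def fraud_score(features, bias, weights):
    """Logistic regression: sigmoid of the weighted feature sum plus bias."""
    z = bias + sum(w * features.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))


bias, weights = load_model(CONFIG_JSON)
suspicious = fraud_score({"num_rejected_calls": 5, "stable_group_size": 0}, bias, weights)
ordinary = fraud_score({"num_rejected_calls": 0, "stable_group_size": 4}, bias, weights)
```

With these made-up weights, a phone with many rejected calls and no stable group scores higher than one with a stable group and no rejections.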
China Mobile Machine Learning Model Training
• Data labels were obtained from police reports and from
online third party sources.
• 118 graph features analyzed to build fraud detection
model. All features collected by one GSQL query.
• Training data’s features collected in GSQL in batch
processing and stored as CSV file for future Model Training.
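A minimal sketch of the batch export step, assuming each training row is a (phone id, feature dictionary, label) triple. The column names and the row shape are assumptions made for illustration; the real export is produced by a GSQL query.

```python
import csv
import io


def write_training_csv(rows, feature_names, out):
    """Write one CSV line per labeled phone: id, features in fixed order, label."""
    writer = csv.writer(out)
    writer.writerow(["phone_id", *feature_names, "label"])
    for phone_id, features, label in rows:
        # Missing features default to 0 so every line has the same columns.
        writer.writerow([phone_id, *(features.get(n, 0) for n in feature_names), label])


# Hypothetical feature names and rows, for illustration only.
feature_names = ["num_calls", "num_rejections"]
rows = [
    ("p1", {"num_calls": 12, "num_rejections": 0}, "good"),
    ("p2", {"num_calls": 40}, "scam"),
]
buffer = io.StringIO()
write_training_csv(rows, feature_names, buffer)
```

Writing features in a fixed column order keeps the file directly usable by an offline training job.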
Graph Features: Stable Group & InGroup Connection
• Stable Group: phones in the target group that have regular
calls (stable connection) with source phone
• Stable InGroup Connections: phones in the target group that
have regular calls (stable connection) among themselves
Stable Connection defined as
● Has both Call and Callback
● Num of Call is larger than a given limit
● Total Duration is larger than a given limit
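The stable connection test can be written as a small predicate. This Python sketch assumes the two limits apply to each direction separately, which the slide does not state explicitly; CallStats, min_calls, and min_duration are illustrative names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallStats:
    """Aggregated statistics for calls in one direction between two phones."""
    num_calls: int
    total_duration: float  # seconds


def edge_is_stable(edge: Optional[CallStats], min_calls: int, min_duration: float) -> bool:
    """An edge is stable if it exists and exceeds both limits (strictly)."""
    return (edge is not None
            and edge.num_calls > min_calls
            and edge.total_duration > min_duration)


def is_stable_connection(outgoing: Optional[CallStats],
                         incoming: Optional[CallStats],
                         min_calls: int,
                         min_duration: float) -> bool:
    """Stable connection: both a call and a callback, each above the limits."""
    return (edge_is_stable(outgoing, min_calls, min_duration)
            and edge_is_stable(incoming, min_calls, min_duration))
```

A pair with calls in only one direction is never stable, however long or frequent those calls are.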
Stable Group Pseudocode
• Step 1: Starting from the given phone vertex (source), find its 1-step neighbors (target1 to target4 in the slide diagram).
• Step 2: Check if a target has both stable outgoing (phone_phone) and stable incoming edges (phone_phone_reversed).
Each phone_phone edge carries: num of call, total duration, call date list, num of rejection.
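Steps 1 and 2 of the Stable Group pseudocode might look like the following in Python. The graph is modeled as a plain dictionary keyed by (caller, callee) with (number of calls, total duration) values; this representation and the limit parameters are assumptions for illustration, not the GSQL query from the demo.

```python
def edge_is_stable(edge, min_calls, min_duration):
    """True if the edge exists and strictly exceeds both limits."""
    return edge is not None and edge[0] > min_calls and edge[1] > min_duration


def stable_group(calls, source, min_calls, min_duration):
    """Return the neighbors of `source` that have a stable connection with it.

    calls maps (caller, callee) -> (num_calls, total_duration).
    """
    # Step 1: 1-step neighbors reached over outgoing call edges.
    targets = {callee for (caller, callee) in calls if caller == source}

    # Step 2: keep targets with a stable edge in both directions.
    return {
        t for t in targets
        if edge_is_stable(calls.get((source, t)), min_calls, min_duration)
        and edge_is_stable(calls.get((t, source)), min_calls, min_duration)
    }


# Illustrative data: only "a" both receives and returns enough calls.
calls = {
    ("s", "a"): (5, 300.0), ("a", "s"): (4, 200.0),
    ("s", "b"): (9, 900.0),                          # no callback
    ("s", "c"): (5, 300.0), ("c", "s"): (1, 10.0),   # callback too weak
}
group = stable_group(calls, "s", min_calls=2, min_duration=60.0)
```

An empty result corresponds to the "Empty stable group" feature listed among the bad phone features.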
Stable InGroup Connections Pseudocode
• Step 1: Starting from the given phone vertex (source), find its 1-step neighbors (target group).
• Step 2: For each vertex in the target group, find its 1-step neighbors and check for stable connections.
• Step 3: Check the stable target for each vertex in the target group.
Each phone_phone edge carries: num of call, total duration, call date list, num of rejection.
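The three in-group steps can be sketched the same way. This Python version interprets Step 3 as checking that the stable target is itself a member of the target group, which is an assumption about the slide's wording; the dictionary-based graph and the limit parameters are likewise illustrative.

```python
def edge_is_stable(edge, min_calls, min_duration):
    """True if the edge exists and strictly exceeds both limits."""
    return edge is not None and edge[0] > min_calls and edge[1] > min_duration


def stable_ingroup_connections(calls, source, min_calls, min_duration):
    """Return pairs of target-group phones that are stably connected to each other.

    calls maps (caller, callee) -> (num_calls, total_duration).
    """
    # Step 1: the target group is the set of 1-step neighbors of the source.
    group = {callee for (caller, callee) in calls if caller == source}

    pairs = set()
    for member in group:
        # Step 2: look at each member's own 1-step neighbors.
        neighbors = {callee for (caller, callee) in calls if caller == member}
        for other in neighbors:
            if other == source or other == member:
                continue
            both_stable = (
                edge_is_stable(calls.get((member, other)), min_calls, min_duration)
                and edge_is_stable(calls.get((other, member)), min_calls, min_duration)
            )
            # Step 3: count the connection only if the other end is in the group.
            if both_stable and other in group:
                pairs.add(tuple(sorted((member, other))))
    return pairs


# Illustrative data: a and b call each other regularly; d is outside the group.
calls = {
    ("s", "a"): (5, 300.0), ("s", "b"): (5, 300.0), ("s", "c"): (5, 300.0),
    ("a", "b"): (6, 400.0), ("b", "a"): (7, 500.0),
    ("a", "d"): (8, 800.0), ("d", "a"): (8, 800.0),
    ("b", "c"): (9, 900.0),                          # one direction only
}
connections = stable_ingroup_connections(calls, "s", min_calls=2, min_duration=60.0)
```

Storing each pair in sorted order means a connection found from either endpoint is counted once.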
GSQL DEMO
http://192.168.55.50:14240/#/query-editor
https://github.com/tigergraph/ecosys/tree/master/guru_scripts/fraud_detection_demo
Q&A
Please send your questions via the Q&A menu in Zoom
Episode 4: Sept 26, 2018
Detecting Fraud and Money Laundering in Real-Time with a Graph DB, Part 2
https://info.tigergraph.com/graph-gurus-4
REGISTER FOR MORE WEBINARS AT https://www.tigergraph.com/webinars-and-events/
Additional Resources
Compare the Developer Edition and Enterprise Free Trial
https://www.tigergraph.com/download/
Guru Scripts
https://github.com/tigergraph/ecosys/tree/master/guru_scripts
Join our Developer Forum
https://groups.google.com/a/opengsql.org/forum/#!forum/gsql-users
Take the Developer Survey
https://www.tigergraph.com/developer-edition-feedback-survey/
@TigerGraphDB youtube.com/tigergraph facebook.com/TigerGraphDB linkedin.com/company/TigerGraph
