Better Together: How Graph database enables easy data integration with Spark and Kafka in the Cloud
September 30th 2020
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Today's Speakers
Emma Liu
Product Manager
● BS in Engineering from Harvey Mudd College, MS in Engineering Systems from MIT
● Prior work experience at Oracle and MarkLogic
● Focus: Cloud, Containers, Enterprise Infra, Monitoring, Management, Connectors
Rayees Pasha
Product Manager
● MS in Computer Science from University of Memphis
● Prior Lead PM and ENG positions at Workday, Hitachi and HP
● Expertise in Database Management and Big Data Technologies
Today’s Outline
1. TigerGraph Architecture and Data Ingestion Overview
2. TigerGraph and Spark Data Pipeline
3. TigerGraph and Kafka Data Pipeline
SYSTEM ARCHITECTURE OVERVIEW
The TigerGraph Difference
Feature: Real-Time Deep-Link Querying (5 to 10+ hops deep)
Design Difference:
● Native Graph design
● C++ engine, for high performance
● Storage Architecture
Benefit:
● Uncovers hard-to-find patterns
● Operational, real-time
● HTAP: Transactions+Analytics
Feature: Handling Massive Scale
Design Difference:
● Distributed DB architecture
● Massively parallel processing
● Compressed storage reduces footprint and messaging
Benefit:
● Integrates all your data
● Automatic partitioning
● Elastic scaling of resource usage
Feature: In-Database Analytics
Design Difference:
● GSQL: High-level yet Turing-complete language
● User-extensible graph algorithm library, runs in-DB
● ACID (OLTP) and Accumulators (OLAP)
Benefit:
● Avoids transferring data
● Richer graph context
● In-DB machine learning
TigerGraph Architecture
Data Ingestion
Step 1
User source data can be ingested in the following ways:
● Bulk load of data files or a Kafka stream in CSV or JSON format
● HTTP POSTs via REST services (JSON)
● GSQL Insert commands
Step 2
The Dispatcher takes in the data ingestion requests in the form of updates to the database:
1. Query IDS to get internal IDs
2. Convert data to internal format
3. Send data to one or more corresponding GPEs
Step 3
Each GPE consumes its partial data updates, processes them, and writes them to disk.
Loading Jobs and POST use UPSERT semantics:
● If vertex/edge doesn't yet exist, create it.
● If vertex/edge already exists, update it.
● Idempotent
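The UPSERT semantics listed in the ingestion steps can be illustrated with a small in-memory model. This is a minimal sketch only: `ToyGraphStore` and its method names are hypothetical and say nothing about how TigerGraph stores data internally. It shows the create-or-update rule and why replaying the same update leaves the graph unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraphStore:
    """In-memory stand-in for a graph store; not TigerGraph's storage engine."""
    vertices: dict = field(default_factory=dict)  # (type, id) -> attributes
    edges: dict = field(default_factory=dict)     # (type, src, dst) -> attributes

    def upsert_vertex(self, vtype, vid, **attrs):
        # Create the vertex if it is missing, otherwise update its attributes.
        self.vertices.setdefault((vtype, vid), {}).update(attrs)

    def upsert_edge(self, etype, src, dst, **attrs):
        # Same rule for edges, keyed by edge type and endpoints.
        self.edges.setdefault((etype, src, dst), {}).update(attrs)


store = ToyGraphStore()
store.upsert_vertex("Person", "alice", age=30)   # does not exist yet: created
store.upsert_vertex("Person", "alice", age=31)   # already exists: updated
store.upsert_vertex("Person", "alice", age=31)   # replayed: state is unchanged
store.upsert_edge("KNOWS", "alice", "bob", since=2020)
store.upsert_edge("KNOWS", "alice", "bob", since=2020)

print(len(store.vertices), store.vertices[("Person", "alice")])
print(len(store.edges))
```

Because a replayed update produces the same state, a loader built on this rule can retry a batch after a failure without creating duplicates.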
Data Ingestion
[Diagram: Incremental data (CSV/JSON inserts, updates, and deletes of vertices and edges) enters through Nginx and Restpp; GSE (IDS) performs ID translation; a Kafka cluster spans Servers 1, 2, and 3; each GPE listens to its corresponding topic for new messages, holds an in-memory copy of the data, and synchronizes data to disk. Arrows are labeled Incoming, Outgoing, and Acknowledge Response.]
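The dispatcher flow (ID translation, conversion to an internal format, routing to a GPE) can be sketched as a toy model. Everything here is an assumption made for illustration: `ToyIdService`, the tuple record format, the three-queue layout, and especially the hash-based routing, since the slides only say that data is sent to the corresponding GPEs.

```python
from collections import defaultdict
from zlib import crc32

NUM_GPES = 3  # hypothetical number of GPE instances


class ToyIdService:
    """Maps external IDs to dense internal integers (a stand-in for IDS)."""

    def __init__(self):
        self._ids = {}

    def internal_id(self, external_id):
        # Assign the next integer on first sight and reuse it afterwards.
        return self._ids.setdefault(external_id, len(self._ids))


def dispatch(updates, ids, queues):
    """Translate IDs, convert each update, and append it to one GPE queue."""
    for external_id, attrs in updates:
        iid = ids.internal_id(external_id)              # 1. query IDS
        record = (iid, tuple(sorted(attrs.items())))    # 2. internal format
        gpe = crc32(external_id.encode()) % NUM_GPES    # 3. assumed hash routing
        queues[gpe].append(record)


ids = ToyIdService()
queues = defaultdict(list)  # one list per GPE, standing in for a topic
dispatch(
    [("alice", {"age": 31}), ("bob", {"age": 40}), ("alice", {"age": 32})],
    ids,
    queues,
)
for gpe, records in sorted(queues.items()):
    print(gpe, records)
```

Routing on a stable function of the external ID keeps every update for one vertex on the same queue, so a single consumer sees them in order.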
Spark and TigerGraph
Spark + TigerGraph Data Pipeline
Typical Spark + TigerGraph Integration
● Data Preparation and Integration (TigerGraph/Spark)
● Unsupervised Learning (TigerGraph)
● Feature Extraction for Supervised Learning (TigerGraph/Spark)
● Model Training (Spark)
● Validate and Apply Model (TigerGraph)
● Visualize and Explore Interconnected Data (TigerGraph)
Spark and TigerGraph Data Pipeline
[Diagram labels: Static Data Sources, Streaming Data Sources, TigerGraph JDBC Driver]
JDBC Driver
● Type 4 driver
● Supports bi-directional (read and write) data flow with TigerGraph
● Read: Converts ResultSet to DataFrame
● Write: Loads DataFrames and files into vertices/edges in TigerGraph
● Supports REST endpoints of built-in, compiled and interpreted GSQL queries from TigerGraph
● Open Source: https://github.com/tigergraph/ecosys/tree/master/tools/etl/tg-jdbc-driver
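A rough sketch of how the read path might be configured from PySpark follows. The driver class name, the `jdbc:tg:` URL scheme, the default port, and the `graph`, `username`, and `dbtable` option forms are assumptions recalled from the driver's README and should be checked against the repository before use. The helper `tigergraph_jdbc_options`, the graph name, and the credentials are hypothetical.

```python
def tigergraph_jdbc_options(host, graph, dbtable, username, password, port=14240):
    """Build an option map for Spark's generic JDBC data source."""
    return {
        "driver": "com.tigergraph.jdbc.Driver",   # assumed driver class name
        "url": f"jdbc:tg:http://{host}:{port}",   # assumed URL scheme and port
        "graph": graph,                           # assumed option: target graph
        "dbtable": dbtable,                       # assumed form, e.g. "vertex Person"
        "username": username,                     # assumed option name
        "password": password,
    }


read_opts = tigergraph_jdbc_options(
    host="127.0.0.1",
    graph="social",            # hypothetical graph name
    dbtable="vertex Person",   # hypothetical vertex type
    username="tigergraph",
    password="changeme",
)
print(read_opts["url"])

# With a live SparkSession named `spark` and the driver JAR on the classpath,
# the read would go through Spark's standard JDBC reader:
#   df = spark.read.format("jdbc").options(**read_opts).load()
```

The write direction would use the same option map through `df.write.format("jdbc")`; the appropriate save mode and any batching options are best taken from the driver documentation rather than from this sketch.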
Supervised ML with TigerGraph - Detecting Phone-Based Fraud by Analyzing Network or Graph Relationship Features at China Mobile
Download the solution brief at https://info.tigergraph.com/MachineLearning
DEMO
Kafka and TigerGraph
Kafka and TigerGraph Data Pipeline
[Diagram labels: Static Data Sources, Streaming Data Sources, Kafka Loader]
Kafka Loader - Speed to Value from Real-time Streaming Data
• Reduce Data Availability Gap and Accelerate Time to Value
• Native Integration with Real-time Streaming Data and Batch Data
• Enables Real-time Graph Feature Updates with Streaming Data in Machine Learning Use Cases
• Decrease Learning Curve With Familiar Syntax
• GSQL Support with Consistent Data Loading Syntax
• Maintain Separation of Control for Data Loading
• Designed with Built-in MultiGraph Support
Kafka Loader: Three Steps
Consistent with GSQL Data Loading Steps
Step 1: Define the Data Source
Step 2: Create a Loading Job
Step 3: Run the Loading Job
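The three steps might translate into GSQL statements along the following lines. This sketch only assembles the statement text in Python. The GSQL syntax shown (`CREATE DATA_SOURCE KAFKA`, `DEFINE FILENAME` with a `$source:` prefix, `RUN LOADING JOB`) is recalled from the TigerGraph 3.0 era documentation and is an assumption to verify against the version in use. The helper name, graph, job, vertex type, and file paths are hypothetical.

```python
def kafka_loading_statements(graph, source, job, cluster_conf, topic_conf, vertex):
    """Return GSQL text for the three steps; the syntax is assumed, verify before use."""
    # Step 1: define the data source from a Kafka cluster config file.
    define_source = (
        f'CREATE DATA_SOURCE KAFKA {source} = "{cluster_conf}" FOR GRAPH {graph}'
    )
    # Step 2: create a loading job that reads from a topic/partition config.
    create_job = (
        f"CREATE LOADING JOB {job} FOR GRAPH {graph} {{\n"
        f'  DEFINE FILENAME f1 = "${source}:{topic_conf}";\n'
        f'  LOAD f1 TO VERTEX {vertex} VALUES ($0, $1) USING SEPARATOR=",";\n'
        f"}}"
    )
    # Step 3: run the loading job.
    run_job = f"RUN LOADING JOB {job}"
    return [define_source, create_job, run_job]


statements = kafka_loading_statements(
    graph="social",                       # hypothetical graph
    source="k1",                          # hypothetical data source name
    job="load_person",                    # hypothetical job name
    cluster_conf="/tmp/kafka_cluster.json",
    topic_conf="/tmp/topic_partition.json",
    vertex="Person",                      # hypothetical two-attribute vertex type
)
print("\n\n".join(statements))
```

Apart from the data source definition, the job body has the same shape as a file-based loading job, which is the consistency the slide refers to.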
Kafka Loader High Level Architecture
● Connect to External Kafka Cluster
● User Commands Through GSQL Server
● Configuration Settings:
○ Config 1: Kafka Cluster Configuration
○ Config 2: Topic/Partition/Offset Info
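The two configuration settings could be expressed as small JSON files, as in the sketch below. The key names (`broker`, `topic`, `partition_list`, `partition`, `start_offset`) are assumptions based on recollection of the Kafka Loader documentation; the broker address, topic name, and file names are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Config 1: Kafka cluster configuration (assumed key name).
cluster_config = {"broker": "localhost:9092"}

# Config 2: topic, partitions, and starting offsets (assumed key names).
topic_config = {
    "topic": "person_events",  # hypothetical topic
    "partition_list": [
        {"partition": 0, "start_offset": 0},
        {"partition": 1, "start_offset": 0},
    ],
}


def write_configs(directory):
    """Write both config files as JSON and return their paths."""
    directory = Path(directory)
    cluster_path = directory / "kafka_cluster.json"
    topic_path = directory / "topic_partition.json"
    cluster_path.write_text(json.dumps(cluster_config, indent=2))
    topic_path.write_text(json.dumps(topic_config, indent=2))
    return cluster_path, topic_path


with tempfile.TemporaryDirectory() as tmp:
    cluster_path, topic_path = write_configs(tmp)
    reloaded = json.loads(topic_path.read_text())
    print(cluster_path.name, topic_path.name, reloaded["partition_list"])
```

Keeping cluster settings separate from topic and offset settings lets one cluster definition be reused by several loading jobs that read different topics.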
DEMO
TigerGraph Architecture + Spark + Kafka
Get Started for Free
● Try TigerGraph Cloud ( tgcloud.io )
● Download TigerGraph’s Developer Edition
● Take a Test Drive - Online Demo
● Get TigerGraph Certified
● Join the Community
@TigerGraphDB /tigergraph /TigerGraphDB /company/TigerGraph
