Cassandra and TitanDB
Insights into DataStax's Graph Strategy
Robin Schumacher – VP Products
Dr. Matthias Broecheler – Director of Engineering
Agenda
• Overview of DataStax
• Introduction to Graph
• Comparing Graph to an RDBMS
• A Look at DataStax’s Graph Strategy
• Next Steps
©2015 DataStax
Overview
Founded in April 2010
450+ employees
410+ customers
Santa Clara, Austin, New York, London, Paris, Tokyo, Sydney
30 Percent
Evolution of Data Management
1970s: Mainframe; 1990s: Client-Server (infrastructure centric)
• Monolithic hardware
• Centralized workloads
• Vendor lock-in
• General purpose databases (one size fits all)
• Isolated / semi-connected
Today: Cloud, Mobile, Social (application / data centric)
• Commodity hardware
• Distributed workloads
• Massive scalability
• Radically connected
Cassandra – NoSQL for Modern Enterprise Workloads
Always on
Fully distributed
Best in scale and performance
80%+ of contributions come from DataStax
Free tools and drivers
Free training
San Francisco, Stockholm, New York
Enabling The Internet Enterprise with DataStax Enterprise
Introduction to Graph
What is a Graph Database?
• High level: Used to manage highly connected or complex data
• User level: Used to support traversal and analytic queries against a data model that uses vertices, edges and properties to represent and store data
• Technical level: Uses specialized index structures, data partitioning techniques, and query optimizers to efficiently traverse large graphs
[Figure: example property graph. Vertices: DataStax, DataBricks, Spark, DSE, Cassandra, Jonathan Ellis, Robin Schumacher, Billy Bosworth. Edge labels: worksFor, develops, uses, reportsTo. Properties: title: VP Product, title: CTO, title: CEO. Callouts mark one Vertex, one Edge, and one Property.]
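To make the terms vertex, edge, and property concrete, here is a minimal Python sketch of a property graph held in memory. It is an illustration under assumptions, not how Titan or DSE Graph store data: the `Vertex`, `Edge`, and `PropertyGraph` names are hypothetical, and apart from Robin Schumacher's role the edge endpoints are guesses at what the figure shows.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_v: Vertex  # vertex the edge starts from
    in_v: Vertex   # vertex the edge points to
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical in-memory property graph, for illustration only."""

    def __init__(self):
        self.vertices = {}
        self.edges = []

    def add_vertex(self, name, **properties):
        vertex = Vertex(name, properties)
        self.vertices[name] = vertex
        return vertex

    def add_edge(self, out_name, label, in_name, **properties):
        edge = Edge(label, self.vertices[out_name], self.vertices[in_name], properties)
        self.edges.append(edge)
        return edge

    def out(self, name, label):
        """Names of vertices reached from `name` over outgoing `label` edges."""
        return [e.in_v.name for e in self.edges
                if e.out_v.name == name and e.label == label]


graph = PropertyGraph()
for name in ("DataStax", "DSE", "Cassandra", "Robin Schumacher"):
    graph.add_vertex(name)

# The title is stored as a property on the edge.
graph.add_edge("Robin Schumacher", "worksFor", "DataStax", title="VP Product")
graph.add_edge("DataStax", "develops", "DSE")   # endpoints assumed
graph.add_edge("DSE", "uses", "Cassandra")      # endpoints assumed

print(graph.out("Robin Schumacher", "worksFor"))
```

A real graph database would index edges per vertex instead of scanning a flat list, which is what makes traversals cheap on large graphs.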
A Graph Database Helps Answer Queries Like…
…should an initiated transaction be considered fraudulent or malicious based
on past user actions or normal patterns of system behavior?
…what products or actions should we recommend to a user based on their
preferences and behavioral patterns to maximize sales or user engagement?
…what campaigns should be run for different segments of a company’s
customer base?
Comparing Graph DB to RDBMS
Key Difference Between Graph DB and RDBMS
• RDBMS: Process to query data elements (joins) is inefficient on large data sets or many relationships
• Graph DB: Better performance for relationship queries due to specialized index structures

• RDBMS: Expressing JOIN-intensive queries in SQL is time-consuming and error-prone
• Graph DB: Intuitive query language enabling faster application development
RDBMS vs. Graph DB: Query Complexity
SELECT TOP (5) [t14].[ProductName]
FROM (SELECT COUNT(*) AS [value],
[t13].[ProductName]
FROM [customers] AS [t0]
CROSS APPLY (SELECT [t9].[ProductName]
FROM [orders] AS [t1]
CROSS JOIN [order details] AS [t2]
INNER JOIN [products] AS [t3]
ON [t3].[ProductID] = [t2].[ProductID]
CROSS JOIN [order details] AS [t4]
INNER JOIN [orders] AS [t5]
ON [t5].[OrderID] = [t4].[OrderID]
LEFT JOIN [customers] AS [t6]
ON [t6].[CustomerID] = [t5].[CustomerID]
CROSS JOIN ([orders] AS [t7]
CROSS JOIN [order details] AS [t8]
INNER JOIN [products] AS [t9]
ON [t9].[ProductID] = [t8].[ProductID])
WHERE NOT EXISTS(SELECT NULL AS [EMPTY]
FROM [orders] AS [t10]
CROSS JOIN [order details] AS [t11]
INNER JOIN [products] AS [t12]
ON [t12].[ProductID] = [t11].[ProductID]
WHERE [t9].[ProductID] = [t12].[ProductID]
AND [t10].[CustomerID] = [t0].[CustomerID]
AND [t11].[OrderID] = [t10].[OrderID])
AND [t6].[CustomerID] <> [t0].[CustomerID]
AND [t1].[CustomerID] = [t0].[CustomerID]
AND [t2].[OrderID] = [t1].[OrderID]
AND [t4].[ProductID] = [t3].[ProductID]
AND [t7].[CustomerID] = [t6].[CustomerID]
AND [t8].[OrderID] = [t7].[OrderID]) AS [t13]
WHERE [t0].[CustomerID] = N'ALFKI'
GROUP BY [t13].[ProductName]) AS [t14]
ORDER BY [t14].[value] DESC
VS.
g.V('customerId','ALFKI').as('customer')
  .out('ordered').out('contains').out('is').as('products')
  .in('is').in('contains').in('ordered').except('customer')
  .out('ordered').out('contains').out('is').except('products')
  .groupCount().cap().orderMap(T.decr)[0..<5].productName
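The intent shared by the two queries (recommend products that similar customers ordered but ALFKI has not) can be walked through in plain Python. This is a sketch under assumptions: the `ORDERS` sample data and the helper names are hypothetical, and the code approximates the logic step by step rather than reproducing Gremlin's exact traversal semantics.

```python
from collections import Counter

# Hypothetical sample data: customer -> list of orders -> product names.
ORDERS = {
    "ALFKI": [["Chai", "Tofu"]],
    "BONAP": [["Chai", "Ikura"], ["Konbu"]],
    "CHOPS": [["Tofu", "Ikura"]],
    "DRACD": [["Pavlova"]],
}


def products_of(customer):
    """Customer -> orders -> products, one entry per order line."""
    return [product for order in ORDERS[customer] for product in order]


def buyers_of(product):
    """Product -> orders -> customers, one entry per matching order line."""
    return [customer
            for customer, orders in ORDERS.items()
            for order in orders
            for item in order
            if item == product]


def recommend(customer, limit=5):
    own = products_of(customer)
    own_set = set(own)
    counts = Counter()
    for product in own:
        for other in buyers_of(product):
            if other == customer:
                continue  # skip the starting customer
            for candidate in products_of(other):
                if candidate not in own_set:  # skip products already ordered
                    counts[candidate] += 1
    # Highest counts first, limited to the top `limit` product names.
    return [name for name, _ in counts.most_common(limit)]


print(recommend("ALFKI"))
```

Each nested loop corresponds to one hop across a relationship, which is the work the SQL version spells out as joins and the Gremlin version as chained `out` and `in` steps.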
RDBMS vs. Graph DB: Data Modeling
Comparing Graph DB to NoSQL
Key Difference Between Graph DB and NoSQL
• NoSQL: Data model can’t represent relationships between rows or documents, requiring application developers to maintain those inside the application, which is cumbersome, inefficient, and error prone
• Graph DB: Natively supports relationships in the data model and provides a query language to efficiently retrieve them
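The application-side bookkeeping described for the NoSQL case might look like the following Python sketch. It assumes a generic key-value or document store, stood in for here by a dictionary; the `STORE` layout, key names, and helper functions are hypothetical and do not reflect any particular database's API.

```python
# Stand-in for a key-value / document store; each get() would be one lookup.
STORE = {
    "customer:ALFKI": {"order_ids": ["order:1"]},
    "order:1": {"customer_id": "customer:ALFKI", "product_ids": ["product:chai"]},
    "product:chai": {"name": "Chai", "order_ids": ["order:1"]},
    "product:tofu": {"name": "Tofu", "order_ids": []},
}


def get(key):
    return STORE[key]


def add_order(order_key, customer_key, product_keys):
    """The application has to update every side of the relationship itself."""
    STORE[order_key] = {"customer_id": customer_key,
                        "product_ids": list(product_keys)}
    STORE[customer_key]["order_ids"].append(order_key)
    for product_key in product_keys:
        STORE[product_key]["order_ids"].append(order_key)


def product_names_for(customer_key):
    """Follow customer -> orders -> products by hand, one lookup per hop."""
    names = []
    for order_key in get(customer_key)["order_ids"]:
        for product_key in get(order_key)["product_ids"]:
            names.append(get(product_key)["name"])
    return names


add_order("order:2", "customer:ALFKI", ["product:chai", "product:tofu"])
print(product_names_for("customer:ALFKI"))
```

If any one of the updates in `add_order` is missed or fails, the stored references drift out of sync, which is the error-prone part the comparison points at.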
NoSQL vs. Graph DB: Query Expressivity
NoSQL: ? (requires application code)
VS.
g.V('customerId','ALFKI').as('customer')
  .out('ordered').out('contains').out('is').as('products')
  .in('is').in('contains').in('ordered').except('customer')
  .out('ordered').out('contains').out('is').except('products')
  .groupCount().cap().orderMap(T.decr)[0..<5].productName
A Look at DataStax’s Graph Strategy
Product Strategy for 2015
• Part of DataStax’s product strategy in 2015 will be to support multiple
data models in DataStax Enterprise (DSE)
• Support for multi-model will occur across several releases of DSE in
2015
Why Multi-Model in DataStax Enterprise?
• Mixed workload needed? Transactions, Analytics, Search: solved in DSE
• Mixed model needed? Wide Row, Graph, JSON: solved in DSE
Why Graph?
• Best answer for applications having highly connected data
• Key enabler of systems of engagement and systems of insight applications
• Use cases include:
  • Personalization
  • Social engagement systems (e.g. matchmaking services, contacts catalogs, etc.)
  • Fraud detection
  • Financial analysis
  • Security analysis
  • Communication
  • Supply chain management
Titan – the Foundation for DSE Graph
• Titan is a scalable, distributed graph database that is optimized for storing,
traversing and querying complex graph data in real time
• Titan is open source and licensed under the Apache 2 license
• Current technical benefits include:
  • Runs on top of Cassandra, HBase, or BerkeleyDB
  • Scale-out and multi-data center capable
  • Able to support thousands of concurrent users and billions of graph data points
  • Analytics on graph data supported via Hadoop integration
  • Search enabled via support for Solr, Lucene, and Elasticsearch
What is DataStax Enterprise Graph?
DSE Graph is a scalable graph database solution for modern Web and mobile
applications that need to manage highly connected data
DSE Graph will be deeply integrated into
the DSE platform:
• Tight Cassandra integration
• Graph analytics powered by Spark
• DSE Search support
• OpsCenter monitoring
2015 Plans for Titan / DSE Graph
• DataStax will contribute to TinkerPop and is dedicated to making it the #1
open source graph framework
• Release Titan 1.0 (compatible with TinkerPop 3 (TP3), a prerequisite that will be released 1-2 months earlier)
• First release of DSE Graph to occur in DSE 5.0. EAP builds will be available for interested customers
• Customers are advised to continue developing with TinkerPop to ensure seamless compatibility with DSE Graph
• DataStax to provide utilities/instructions for moving existing Titan databases to DSE Graph
Next Steps
• Check the DataStax blog for updates on DSE Graph
• If you are a current DSE customer, contact us about participating in upcoming Early Adopter Program (EAP) releases of DSE Graph
• If you haven’t tried DSE yet, download it from http://www.datastax.com/download and follow our getting started guide in your own environment (or use the DataStax Sandbox)
Thank you!
Questions?

Editor's Notes

  • #4: We were founded in April 2010, so we are almost 5 years old. On December 1st, 2013, we had 150 employees; we now have over 380. That is roughly 2.5 times the headcount in little over 12 months, impressive growth by any standard. We are headquartered in Santa Clara, California, with offices in Austin, New York, London, Paris, Tokyo and Sydney, and we have transacted business in over 50 countries around the world. We have over 400 customers that span many different verticals and use cases. As you can see here, we have decided to show our customer logos on an iPhone. That is not because it’s the done thing in presentations today but rather to impress upon you that if you are using an iPhone, iOS or Android device, the chances are you are already using our technology as you move through your day. Currently 29 of the Fortune 100 use DataStax as their database technology of choice, and this number is rising. We are on a remarkable journey, and this rate and trajectory of growth isn’t letting up. So why have we been experiencing such incredible growth and success?
  • #5: Describe the evolution of various transactional database technologies from mainframe to today’s distributed databases.
  • #15: http://sql2gremlin.com/#complex/recommendation
  • #16: http://sql2gremlin.com/#complex/recommendation
  • #19: http://sql2gremlin.com/#complex/recommendation
  • #28: Titan 1.0 in Summer. DSE Graph in Winter.