Graph Tour Washington DC
#1 Database for Connected Data
Jeff Morris
Head of Product Marketing
jeff@neo4j.com
5/7/19
I’m still listening to a lot of graph-y books
Adjacent Possibilities Think in Maps Connecting with People JPL Innovation
Uniqueness of Individuals Practice, Practice, Practice
Food
Journey
Space
Journey
Human Senses
Innovation Startups
FATHER_OF
DRIVES
My Graph
Agenda
• Great Graph Stories are here in Washington DC
• State of the Graph in 2019
• Innovation Waves
• Looking ahead at Recommendations, AI and Graphs
Neo4j Is Helping The World To Make Sense of Data
ICIJ used Neo4j to uncover the
world’s largest journalistic
leak to date, The Panama
Papers
NASA uses Neo4j for a
“Lessons Learned” database
to improve effectiveness in
search missions in space
Neo4j is used to graph the
human body, map correlations,
identify cause & effect and
search for the cure for cancer
SAVING
DEMOCRACY
MISSION TO
MARS
CURING CANCER
In-Q-Tel’s Mission Economy
• Venture Capital sponsored by
National Intelligence
• Decomposes and reassembles
technology stacks into common
“genome” vocabulary
• Matches mission problems to
technology assemblies and
vendors
• Evaluates tech across
communications, biotech,
robotics, software, hardware, IoT
• Faster evaluations, better
innovations
ACCOUNT
ADDRESS
PERSON
PERSON
NAME
STREET
BANK
NAME
COMPANY
BANK
BAHAMAS
2.6 TB
11.5 million documents
Emails, Scanned Documents,
Bank Statements etc…
Person B
Bank US
Account 123
Person A
Acme Inc
Bank Bahamas
Address X
NODE
RELATIONSHIP
RELATIONSHIP
ICIJ Pulitzer Prize Winner 2017
Paradise Papers Metadata Model
Business Problem
• Find relationships between people, corporations, accounts,
shell companies and offshore accounts
• Journalists are non-technical
• 2017 leak from Appleby, a tax-sheltering law firm, matched
13.4 million account records with public business
registration data from across the Caribbean
Solution and Benefits
• Exposed tax sheltering practices of Apple, Nike
• Revealed hidden connections among politicians and nations,
like Wilbur Ross & Putin’s son-in-law
• Triggered government tax evasion investigations in US, UK,
Europe, India, Australia, Bermuda, Canada and Cayman
Islands within 2 days.
• Granted $1M endowment from Golden Globes’ HFPA
Background
• International Consortium of Investigative Journalists (ICIJ),
Pulitzer Prize winning journalists
• Fourth blockbuster investigation using Neo4j to reveal
connections in text-based and account-based data leaked
from offshore law firms and government records about the
“1% Elite”
• Appends to the Neo4j-based “Offshore Leaks Database”
ICIJ Paradise Papers INVESTIGATIVE JOURNALISM
Fraud Detection / Knowledge Graph
Background
• US IT consulting firm helped US Army streamline
equipment deployments and maintenance spending
• Saving lives by improving the operational readiness
of Army equipment like tanks, radios, transports,
aircraft, weaponry, etc.
Business Problem
• Needed to modernize procurement, budget and
logistics processes for equipment & spare parts
• Millions of connections among a tank’s
bill-of-materials, for example
• Improve “what if” cost calculations when planning
missions and troop deployments
• Mainframe systems required over 60 man-hrs to
calculate changes… planning took too long.
Solution and Benefits
• 118M nodes & 185M relationships
• Cut cost estimation times by 88%
• Improved parts delivery timing and accuracy
• DBA labor required dropped by 77%
• Equipment TCO more predictable
• Safer soldiers
US Army / Calibre Systems Equipment Logistics
Parts Assembly & Equipment Maintenance
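The "what if" cost calculation over a bill-of-materials described above can be sketched as a recursive roll-up over part-to-sub-part relationships. This is a minimal illustration only: the parts, quantities, prices and the `rollup_cost` helper are all hypothetical and are not taken from the Army or Calibre system.

```python
# Hypothetical bill of materials: part -> list of (sub_part, quantity).
BOM = {
    "tank": [("engine", 1), ("track", 2)],
    "engine": [("piston", 8), ("filter", 2)],
    "track": [("link", 80)],
}

# Hypothetical unit costs for leaf parts (parts with no sub-parts).
UNIT_COST = {"piston": 120.0, "filter": 15.0, "link": 9.5}


def rollup_cost(part, unit_cost=None):
    """Return the total cost of `part` by walking its sub-part relationships."""
    costs = UNIT_COST if unit_cost is None else unit_cost
    children = BOM.get(part)
    if not children:
        return costs[part]  # leaf part: use its unit cost directly
    return sum(qty * rollup_cost(child, costs) for child, qty in children)


baseline = rollup_cost("tank")
# "What if" the filter price doubles? Re-run with one cost overridden.
what_if = rollup_cost("tank", {**UNIT_COST, "filter": 30.0})
print(baseline, what_if)
```

In a graph database the same roll-up would be a traversal over stored relationships rather than a dictionary lookup; the sketch only shows the shape of the calculation.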
State of the graph in 2019
2000+
7/10
12/25
8/10
53K+
100+
300+
450+
Adoption
Top Retail Firms
Top Financial Firms
Top Software Vendors
Customers Partners
• Creator of the Neo4j Graph Platform
• 250+ employees
• HQ in Silicon Valley, other offices include
London, Munich, Paris and Malmö Sweden
• $80M Series E led by Morgan Stanley &
One Peak.
• $160M total raised to date
• 20M+ downloads & container pulls
• 300+ enterprise subscription customers,
more than half with >$1B in revenue
Ecosystem
Startup Program Alumni
Enterprise customers
Partners
Meetup members
Events per year
Neo4j - The Graph Company
The Industry’s Largest Dedicated Investment in Graphs
Networks of People Business Processes Knowledge Networks
E.g., Risk management, Supply
chain, Payments
E.g., Employees, Customers,
Suppliers, Partners,
Influencers
E.g., Enterprise content,
Domain specific content,
eCommerce content
Data connections are increasing as rapidly as data volumes
The Rise of Connections in Data
Electronic Networks
On-prem & cloud
computing, Cellular,
Telco & Internet, IoT,
Blockchain
CAR
name: “Dan”
born: May 29, 1970
twitter: “@dan”
name: “Ann”
born: Dec 5, 1975
since:
Jan 10, 2011
brand: “Volvo”
model: “V70”
Latitude: 37.5629900°
Longitude: -122.3255300°
Nodes
• Can have Labels to classify nodes
• Labels have native indexes
Relationships
• Relate nodes by type and direction
Properties
• Attributes of Nodes & Relationships
• Stored as Name/Value pairs
• Can have indexes and composite indexes
• Visibility security by user/role
Neo4j Invented the Labeled Property Graph Model
MARRIED TO
LIVES WITH
PERSON PERSON
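The labeled property graph model on this slide (nodes with labels, typed and directed relationships, and name/value properties on both) can be sketched as plain data structures. This is an illustrative sketch only: the `Node` and `Relationship` classes are hypothetical and are not Neo4j's storage format or API, and attaching `since` to the LIVES_WITH relationship is an assumption, since the slide text does not say which relationship carries it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                      # labels classify the node
    properties: dict = field(default_factory=dict)   # name/value pairs


@dataclass
class Relationship:
    start: Node                                      # direction: start -> end
    type: str
    end: Node
    properties: dict = field(default_factory=dict)   # relationships hold properties too


dan = Node({"Person"}, {"name": "Dan", "born": "1970-05-29", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "1975-12-05"})
car = Node({"Car"}, {"brand": "Volvo", "model": "V70"})

relationships = [
    Relationship(dan, "MARRIED_TO", ann),
    # Assumption: the slide does not state which relationship has "since".
    Relationship(dan, "LIVES_WITH", ann, {"since": "2011-01-10"}),
]


def outgoing(node, rel_type):
    """Return the end nodes of relationships of `rel_type` leaving `node`."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


print([n.properties["name"] for n in outgoing(dan, "MARRIED_TO")])
```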
Graph Databases are Designed for Connected Data
TRADITIONAL
DATABASES
BIG DATA
TECHNOLOGY
Store and retrieve data Aggregate and filter data Connections in data
Real time storage & retrieval Real-Time Connected Insights
Long running queries
aggregation & filtering
“Our Neo4j solution is literally thousands of times
faster than the prior MySQL solution, with queries that
require 10-100 times less code”
Volker Pacher, Senior Developer
Up to
3
Max
# of
hops
1 Millions
Internal & Confidential, Neo4j Inc.
Neo4j Graph Advantage: Foundational Components
1
2
3
4
5
6
Index-Free Adjacency
In memory and on flash/disk
vs
ACID Foundation
Required for safe writes
Full-Stack Clustering
Causal consistency
Language, Drivers, Tooling
Developer Experience,
Graph Efficiency, Type Safety
Graph Engine
Cost-Based Optimizer, Graph
Statistics, Cypher Runtime
Hardware Optimizations
For next-gen infrastructure
Strongly Differentiated Commercial Offering
Enterprise Edition is Highly Differentiated from Community Open Source Edition
Columns: Feature, Community, Enterprise
Date/Time data type¹ ✔ ✔
3D Geospatial data types¹ ✔ ✔
Native String Indexes – up to 5x faster writes¹ ✔ ✔
100B+ Bulk Importer¹ ✔ Resumable¹
Enterprise Cypher Runtime up to 70% faster – ✔
Hot Backups – 2x Faster¹
ACID Transactions ✔ ✔
High-performance native API ✔ ✔
High-performance caching ✔ ✔
Cost-based query optimizer ✔ ✔
Graph algorithms library to support AI initiatives ✔ ✔
Massively parallel graph algorithms – ✔
Query monitoring with enriched metrics – ✔
User and role-based security – ✔
LDAP and Active Directory Integration – ✔
Kerberos security option – ✔
Multi-Clustering (partition of clusters)¹ – ✔
Automatic Cache Warming¹ – ✔
Rolling Upgrades¹ – ✔
Resumable Copy/Restore Cluster Member – ✔
New diagnostic metrics and support tools¹ – ✔
Property Blacklisting – ✔
Language drivers for Java, Python, C# & JavaScript ✔ ✔
Bolt Binary Protocol ✔ ✔
RPM, Debian, Docker, Azure & AWS Cloud Delivery ✔ ✔
Intra-cluster encryption secures all traffic across data centers and cloud zones – ✔
IPv6 support in clustered deployments – Available
High throughput, least-connected load balancing built into Bolt drivers – ✔
Causal Clustering, core and read-replica design at global scale for applications, analytics workflows, HA and DR – ✔
Enterprise Lock Manager accesses all cores on server – ✔
Labeled property graph model ✔ ✔
Native graph processing & storage ✔ ✔
Cypher graph query language ✔ ✔
Neo4j Browser with syntax highlighting ✔ ✔
Fast writes via native label indexes ✔ ✔
Composite Indexes ✔ ✔
Cypher for Apache Spark (CAPS) for big data analytics ✔ ✔
Graph size limitations 34B nodes None
Auto reuse of deleted space – ✔
Property existence constraints – ✔
Cypher query tracing, monitoring and metrics – ✔
Node Key schema constraints – ✔
Neo4j Desktop: Free developer-friendly package with full database and tools – ✔
Sections: Database Features, Architectural Features, Graph Platform Features
¹ New in Neo4j 3.4
Neo4j Graph Platform Vision
Development &
Administration
Analytics
Tooling
BUSINESS
USERS
DEVELOPERS
ADMINS
Graph
Analytics
Graph
Transactions
Data Integration
Discovery & Visualization
DATA
ANALYSTS
DATA
SCIENTISTS
Drivers & APIs
APPLICATIONS
AI
openCypher
Cloud
Development &
Administration
Analytics
Tooling
Graph
Analytics
Graph
Transactions
Data Integration
Discovery & Visualization
Drivers & APIs
AI
Neo4j Database 3.4 & 3.5
• Full Text Search
• Native Indexes
(up to 5x faster writes)
• 100B+ bulk importer
Improved Admin Experience
• Rolling upgrades
• 2x faster backups
• Cache Warming on startup
• Improved diagnostics
Morpheus for Apache Spark
• Graph analytics in the data lake
• In-memory Spark graphs from
Apache Hadoop, Hive,
Gremlin and Spark
• Save graphs into Neo4j
• High-speed data exchange
between Neo4j & data lake
• Progressive analysis using
named graphs
Graph Data Science
• High speed graph
algorithms
Neo4j Bloom
• New graph illustration and
communication tool for
non-technical users
• Explore and edit graph
• Search-based
• Create storyboards
• Foundation for graph data
discovery
• Integrated with graph platform
Multi-Cluster routing built into Bolt drivers
• Date/Time data type
• 3-D Geospatial search
• Secure, Horizontal Multi-Clustering
• Property-value Security
The Neo4j Graph Platform
Graph Apps
Neo4j Bloom
• High fidelity
• Scene navigation
• Property views
• Search suggestions
• Saved phrase history
• Property editor
• Schema perspectives
• Bloom chart type
• Visualize
• Communicate
• Discover
• Navigate
• Isolate
• Edit
• Share
Neo4j 4.0
Neo4j Fabric
Schema-Based Security
Multi-Database
Reactive Drivers
And more!!
Graphs Are VERY Hungry for Data
Graphs’ appetite to connect more data accelerates the ability to find
adjacent innovations
Customer iteration cycles range from 2 weeks to 3 months
Graph Database Surging in Popularity
20M+
Downloads
8M+ from Neo4j Distribution
12M+ from Docker
Events
400+
Approximate Number of
Neo4j Events per Year
50k+
Meetups
Number of Meetup
Members Globally
50k+
Trained/certified Neo4j
professionals
1k Certified
Trained Developers
Largest Pool of Graph Technologists
Density Drives Value In Graphs
Metcalfe’s Law of the Network (V = n²)
5 hops < less Value
100’s of hops deliver
immense VALUE
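Metcalfe's Law can be made concrete with a few lines of arithmetic. A minimal sketch, assuming the slide's simplified form V = n² (the function names are illustrative); it also counts the distinct pairwise links among n nodes, which grows at the same quadratic rate.

```python
def metcalfe_value(n):
    """Network value under the slide's simplified form V = n^2."""
    return n ** 2


def possible_links(n):
    """Number of distinct pairwise connections among n nodes."""
    return n * (n - 1) // 2


for n in (10, 20, 40):
    print(n, metcalfe_value(n), possible_links(n))
```

Doubling the number of connected nodes roughly quadruples both figures, which is the sense in which density drives value.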
“Neo4j continues to dominate the graph
database market.”
“69% of enterprises have, or are planning
to implement graphs over next 12 months”
October, 2017
“The most widely stated
reason in the survey for
selecting Neo4j was
to drive innovation”
February, 2018
Critical Capabilities for DBMS
“In fact, the rapid rise of Neo4j and other graph
technologies may signal that data
connectedness is indeed a separate paradigm
from the model consolidation happening across
the rest of the NoSQL landscape.”
March, 2018
Analysts See Unique Benefits of Graphs
“Neo4j is the clear market leader in the graph space. It has the
most users, it uses a widely adopted language that is much easier
than Gremlin and in many respects, it has consistently been a lot
more innovative than its competitors.”
“It is the Oracle or SQL Server of the graph database world.”
March, 2019
“Our research suggests that graph databases have the
best chance to survive and thrive as a distinct
category (versus the other NoSQL models) because
connected data applications present serious performance
problems that only a specialized graph DB can solve.”
March, 2019
Neo4j Has a Ten Year Head Start
Native Connectedness Differentiates Neo4j
Conceive
Code
Compute
Store
Non-Native Graph DB
Native Graph DB
RDBMS
Optimized for graph workloads
Graph Database Vendor Landscape
NEO4J SIGNIFICANTLY OUTPACES COMPETITION IN GRAPH LEADERSHIP & INVESTMENT, TECHNOLOGY
CAPABILITY, COMMUNITY BREADTH AND PRODUCT MATURITY
Graph Pioneer & Leader
Architectures optimized for
non-graph workloads.
Not easily adaptable for
graphs. Lack “minutes to
milliseconds” performance.
Few graph-expert resources
Nascent products fall
vastly short.
Graphs as a checkbox.
Slow performance.
Playing ‘catch-up,’ requiring
years to stabilize & grow.
Aggressive posture & claims
to secure PR.
Many fail the “kill -9” test
Graph pioneer & visionary.
Largest, most active community.
More customer successes than all
other vendors combined.
Strongest technology.
Diverse roadmap: cloud, DBaaS,
Spark, Algos for AI, GQL.
Real-Time
Recommendations
Fraud
Detection
Network &
IT Operations
Master Data
Management
Knowledge
Graph
Identity & Access
Management
Common Graph Technology Use Cases
AirBnb
Highly Valuable Connected Data Use Cases
Drive Enterprise Adoption
Real-Time
Recommendations
Fraud
Detection
Network &
IT Operations
Master Data
Management
Identity & Access
Management
Knowledge
Graph
Background
• Over 7M citizens suffer from diabetes
• Connecting over 400 researchers
• Incorporates over 50 databases, 100k’s of Excel
workbooks, 30 databases of biological samples
• Sought to examine disease from as many angles as
possible.
Business Problem
• Genes are connected by proteins or to metabolites,
and patients are connected with their diets, etc…
• Needed to improve the utilization of immensely
technical data
• Needed to cater to doctors and researchers with
simple navigation, communication and connections
of the graph.
Solution and Benefits
• Dr. Alexander Jarasch, Head of Bioinformatics and
Data Management
• Scientists can conduct parallel research without
asking the same questions or repeating tests
• Built views like a liver sample knowledge graph
DZD - German Center for Diabetes Research
Medical Genomic Research
EE Customer since 2016
Q4
Software
Financial
Services Telecom
Retail & Consumer
Goods
Media &
Entertainment Other Industries
Airbus
Over 300 Enterprises and 10s of
Thousands of Projects on Neo4j
Background
• Fortune 100 heavy equipment manufacturer
• 27 Million warranty & service documents parsed
• Foundation for AI-based supply chain management
Business Problem
• Improve maintenance predictability
• Need a knowledge base for 27 million warranty
documents and maintenance orders
• Graphs gather context for AI to identify ‘prime
examples’ of connections among parts, suppliers,
customers and their mechanics, and to anticipate when
equipment will need servicing and by whom.
Solution and Benefits
• Text to knowledge graph
• Common ontology for complaints, symptoms & parts
• Anticipates when equipment will need servicing
• Improves customer and brand satisfaction
• Maximizes lifespan and value of equipment
Caterpillar Heavy Equipment Manufacturing
Parts Assembly & Equipment Maintenance
7 of the Top 10 Software
Companies Use Neo4j
Background
• Social network of 10M graphic artists
• Peer-to-peer evaluation of art and works-in-progress
• Job sourcing site for creatives
• Massive volume: millions of updates (reads & writes) to
the Activity Feed
• 150 Mongos to 48 Cassandras to 3 Neo4j’s!
Business Problem
• Artists subscribe, appreciate and curate “galleries”
of works of their own and from other artists
• Activities Feed is how everyone receives updates
• 1st implementation was 150 MongoDB instances
• 2nd implementation shrunk to 48 Cassandras, but it
was still too slow and required heavy IT overhead
Solution and Benefits
• 3rd implementation shrunk to 3 Neo4j instances
• Saved over $500k in annual AWS fees
• Reduced data footprint from 50TB to 40GB
• Significantly easier to introduce new features like
“New projects in your Network”
Adobe Behance Social Network of 10M Graphic Artists
Social Network
EE Customer since 2016
Q4
8 of the Top 10 Insurance
Companies Use Neo4j
Home
Security
Internet of
things
Institutional
Memory
Entertainment
Recommendations
Home
Operations
Personalization
Voice Enabled Smart Home
Background
• Largest Cable TV & Internet Provider in US
• 3rd Largest network on the planet
• xFi is consumer experience in 3M houses
• Internet, router, devices, security, voice & telephony
• Transformational customer experience
Business Problem
• Integrate all experience in a smart home
• Create innovative ideas based on cross-platform
and household member preferences
• Add integrated value of Xfinity triple-play &
quad-play services (internet, VoIP, cable TV & home
security)
Solution and Benefits
• Custom content per household member
• Security reminders (kids are home, garage left open)
• Serves millions of households
• Makes content recommendations based on
occupant, time of day, permissions and preferences
• Has Siri-like voice commands
COMCAST Xfinity xFi TELECOMMUNICATIONS
Smart Home / Internet of Things
EE Customer since 2016
Q4
Analog to Digital Innovations
Common Graph Entities are Analog
People
Locations
Processes
Devices
Objects
Motives
• Who – People
• What – Activities & Events
• Where – Locations
• When – Time
• Why – Motives & Feelings
• How – Processes, Devices &
Networks
Activities
The Whiteboard Model Is the Physical Model
Ideation is an analog
activity
• Easily understood
• Easily evolved
• Easy collaboration
between business
and IT
Neo4j Innovation Lab
Innovation Lab Illustrated
Graphs Drive Innovation
Context Paths
Auto-Graphs
Graph Layers
1st Order Graph
Cross-Connect
Cross-tech applications
Internet of Things operations
Transparent Neural
Networks
Blockchain-managed
systems
Adjacent graph layers inspire
new innovations
Metadata / Risk Management
Knowledge Graphs
AI-Powered Customer
Experiences
Connect unlike objects such
as people to products,
locations
Mobile app explosion
Recommendation engines
Fraud detectors
Desire for more context to
follow connections
Extract properties during
traversals
Connects like objects
People, computer networks,
telco, etc
Cypher: Powerful and Expressive Query Language
MATCH (:Person {name: "Dan"})-[:MARRIED_TO]->(spouse)
RETURN spouse
MARRIED_TO
Dan Ann
NODE RELATIONSHIP TYPE
LABEL PROPERTY VARIABLE
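To make the pattern concrete, here is a small Python emulation of what the query on this slide asks for: start from a node labeled Person whose name is "Dan", follow outgoing MARRIED_TO relationships, and bind the far end to `spouse`. The in-memory `nodes` and `edges` structures and the `match_spouses` helper are hypothetical and only mirror the slide's example; this is not how Neo4j executes Cypher.

```python
# Tiny in-memory graph; the data mirrors the Dan/Ann example on the slide.
nodes = {
    1: {"labels": {"Person"}, "name": "Dan"},
    2: {"labels": {"Person"}, "name": "Ann"},
}
edges = [(1, "MARRIED_TO", 2)]  # (start node id, relationship type, end node id)


def match_spouses(name):
    """Emulate: MATCH (:Person {name: $name})-[:MARRIED_TO]->(spouse) RETURN spouse."""
    return [
        nodes[end]
        for start, rel_type, end in edges
        if rel_type == "MARRIED_TO"
        and "Person" in nodes[start]["labels"]
        and nodes[start]["name"] == name
    ]


print([spouse["name"] for spouse in match_spouses("Dan")])
```

Because the pattern is directed, the same lookup starting from "Ann" finds nothing in this data set.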
The GQL Manifesto: https://gql.today/
• Introduced in May 2018: https://gql.today/
• An initiative to immediately
rally support for a unified
Graph Query Language
• Standards meetings are ongoing
• All community members
are encouraged to Vote
their support at
https://gql.today/#vote
Keith Hare, GraphConnect 2018
Data Sources
CLIENT Admin Dashboard
Session
Data
Feedback
Scored
Recommendations
Graph
Algorithms
AI / ML
Click
Stream
Data
INTELLIGENT RECOMMENDATIONS FRAMEWORK
Discovery
Exclude
Boost
Diversity
User Segmentation
Item Similarity
Intelligent Recommendations Framework
Recommendation Engines
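The Discovery, Exclude and Boost stages named in the framework can be sketched as a small scoring pipeline. This is a hedged illustration, not the framework's actual implementation: the purchase data, the `recommend` function and the scoring rule (count of paths through users who share an item) are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical purchase graph: user -> set of items bought.
purchases = {
    "alice": {"book", "lamp"},
    "bob": {"book", "desk", "chair"},
    "carol": {"lamp", "desk"},
}


def recommend(user, boost=None, exclude=frozenset()):
    """Rank items for `user` via discovery, exclusion and boosting."""
    boost = boost or {}
    owned = purchases[user]
    scores = Counter()
    # Discovery: follow user -> item -> other user -> other item paths.
    for other, items in purchases.items():
        if other == user or not (owned & items):
            continue
        for item in items - owned:
            scores[item] += 1
    # Exclude: drop items that must not be shown.
    for item in exclude:
        scores.pop(item, None)
    # Boost: multiply the score of promoted items.
    for item, factor in boost.items():
        if item in scores:
            scores[item] *= factor
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))


print(recommend("alice"))
print(recommend("alice", boost={"chair": 3}))
print(recommend("alice", exclude={"desk"}))
```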
Graph Analytics
Graph
Algorithms
Cypher for
Apache Spark™
Graph-Enhanced AI
& ML
Similarity
ML
Graph & ML Algorithms in Neo4j
+35
neo4j.com/graph-algorithms-book/
Pathfinding
& Search
Centrality /
Importance
Community
Detection
Link Prediction
Finds optimal paths
or evaluates route
availability and quality
Determines the
importance of distinct
nodes in the network
Detects group
clustering or partition
options
Evaluates how
alike nodes are
Estimates the
likelihood of nodes
forming a future
relationship
Similarity
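Two of the categories above, pathfinding and centrality, can be illustrated in a few lines. The sketch below uses a plain breadth-first search for an unweighted shortest path and a simple degree count for centrality; the graph data and function names are made up for the example, and this is not the Neo4j graph algorithms library.

```python
from collections import deque

# Hypothetical undirected graph as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search: returns one shortest path by hop count, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


def degree_centrality(graph):
    """Degree centrality as a raw neighbor count per node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


print(shortest_path(graph, "A", "E"))
print(degree_centrality(graph))
```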
Graph and ML Algorithms in Neo4j
• Parallel Breadth First Search &
DFS
• Shortest Path
• Single-Source Shortest Path
• All Pairs Shortest Path
• Minimum Spanning Tree
• A* Shortest Path
• Yen’s K Shortest Path
• K-Spanning Tree (MST)
• Random Walk
• Degree Centrality
• Closeness Centrality
• CC Variations: Harmonic, Dangalchev,
Wasserman & Faust
• Betweenness Centrality
• Approximate Betweenness Centrality
• PageRank
• Personalized PageRank
• ArticleRank
• Eigenvector Centrality
• Triangle Count
• Clustering Coefficients
• Connected Components (Union Find)
• Strongly Connected Components
• Label Propagation
• Louvain Modularity – 1 Step & Multi-Step
• Balanced Triad (identification)
• Euclidean Distance
• Cosine Similarity
• Jaccard Similarity
• Overlap Similarity
• Pearson Similarity
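Three of the similarity measures listed above have short textbook definitions, sketched here in plain Python. The function names are illustrative and this is not the Neo4j library's API; Jaccard and overlap operate on sets (for example, the neighbor sets of two nodes), while cosine operates on equal-length numeric vectors.

```python
import math


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap(a, b):
    """Size of the intersection divided by the size of the smaller set."""
    a, b = set(a), set(b)
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


def cosine(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


print(jaccard({1, 2, 3}, {2, 3, 4}))
print(overlap({1, 2}, {2, 3, 4}))
print(cosine([1, 0], [0, 1]))
```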
Pathfinding
& Search
Centrality /
Importance
Community
Detection
Similarity
neo4j.com/docs/graph-algorithms/current/
Updated April 2019
Link
Prediction
• Adamic Adar
• Common Neighbors
• Preferential Attachment
• Resource Allocation
• Same Community
• Total Neighbors
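Several of the link prediction scores listed above follow directly from neighbor sets. A minimal sketch using the standard definitions on a small hypothetical undirected graph; the function names are illustrative and are not the Neo4j library's procedures.

```python
import math

# Hypothetical undirected graph: node -> set of neighbors.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
}


def common_neighbors(g, u, v):
    """Number of neighbors shared by u and v."""
    return len(g[u] & g[v])


def total_neighbors(g, u, v):
    """Number of distinct neighbors of u or v."""
    return len(g[u] | g[v])


def preferential_attachment(g, u, v):
    """Product of the two nodes' degrees."""
    return len(g[u]) * len(g[v])


def adamic_adar(g, u, v):
    """Sum of 1 / log(degree) over shared neighbors; rarer hubs count more."""
    return sum(1 / math.log(len(g[w])) for w in g[u] & g[v])


# Score the currently unconnected pair A and D.
print(common_neighbors(graph, "A", "D"))
print(preferential_attachment(graph, "A", "D"))
print(adamic_adar(graph, "A", "D"))
```

A higher score suggests the pair is more likely to form a relationship later; in an undirected graph every shared neighbor has degree of at least 2, so the logarithm in the Adamic Adar score is never zero.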
Graph Analytics:
SparkCypher & Morpheus
Objective: Draw new users from the Spark ecosystem
to graphs & Neo4j
(Also bolsters Cypher as the de-facto query language)
AI & Graphs
EVIDENCE
BASED
MACHINE
LEARNING SYSTEMS
PRESCRIPTIVE
ANALYTICS
NATURAL LANGUAGE GENERATION
“Yankees”
“Giants”
“Penguins”
“Jets”
“Bears”
“Red Sox”
NLP/TEXT MINING
PREDICTIVE
ANALYTICS
RECOMMENDATION
ENGINES
DEEP
LEARNING
Graphs Provide Connections
& Context for AI
Knowledge Graphs
70 GraphConnect speakers 2015-2017
Thomson Reuters Graph
• Data Fusion for Portfolio
Managers
• Graph layers
1. Knowledge Graphs: Context for Decisions
2. Connected Feature Extraction: Context for Accuracy
3. Graph-Accelerated AI: Context for Efficiency
4. AI Explainability: Context for Credibility
Four Pillars of Graph-Enhanced AI
More Data Enables
More Use Cases
Data Network Effect
“A product, generally powered by machine learning, becomes smarter
as it gets more data from your users. The more users use your product,
the more data they contribute; the more data they contribute, the
smarter your product becomes.”
— Matt Turck
A Highly Connected Future
Your Homework - Connect
Enjoy the Conference!

Graph tour keynote 2019

  • 1.
    Graph Tour WashingtonDC #1 Database for Connected Data Jeff Morris Head of Product Marketing [email protected] 5/7/19
  • 2.
    I’m still listeningto a lot of graph-y books Adjacent Possibilities Think in Maps Connecting with PeopleJPL Innovation Uniqueness of Individuals Practice, Practice, Practice Food Journey Space Journey Human Senses InnovationStartups
  • 3.
  • 4.
    Agenda • Great GraphStories are here in Washington DC • State of the Graph in 2019 • Innovation Waves • Looking ahead at Recommendations, AI and Graphs
  • 5.
    Neo4j Is HelpingThe World To Make Sense of Data 5 ICIJ used Neo4j to uncover the world’s largest journalistic leak to date, The Panama Papers NASA uses Neo4j for a “Lessons Learned” database to improve effectiveness in search missions in space Neo4j is used to graph the human body, map correlations, identify cause & effect and search for the cure for cancer SAVING DEMOCRACY MISSION TO MARS CURING CANCER
  • 6.
    In-Q-Tel’s Mission Economy 6 •Venture Capital sponsored by National Intelligence • Decomposes and reassembles technology stacks into common “genome” vocabulary • Matches mission problems to technology assemblies and vendors • Evaluates tech across communications, Bio tech, robotics, software, hardware, IoT • Faster evaluations, better innovations
  • 8.
  • 9.
    2.6 TB 11.5 milliondocuments Emails, Scanned Documents, Bank Statements etc… Person B Bank US Account 123 Person A Acme Inc Bank Bahama s Address XNODE RELATIONSHIP
  • 13.
  • 14.
  • 16.
    Business Problem • Findrelationships between people, corporations, accounts, shell companies and offshore accounts • Journalists are non-technical • 2017 Leak from Appleby tax sheltering law firm matched 13.4 million account records with public business registrations data from across Caribbean Solution and Benefits • Exposed tax sheltering practices of Apple, Nike • Revealed hidden connections among politicians and nations, like Wilbur Ross & Putin’s son in law • Triggered government tax evasion investigations in US, UK, Europe, India, Australia, Bermuda, Canada and Cayman Islands within 2 days. • Granted $1M endowment from Golden Globes’ HFPA Background • International Consortium of Investigative Journalists (ICIJ), Pulitzer Prize winning journalists • Fourth blockbuster investigation using Neo4j to reveal connections in text-based, and account-based data leaked from offshore law firms and government records about the “1% Elite” • Appends Neo4j-based, “Offshore Leaks Database” ICIJ Paradise Papers INVESTIGATIVE JOURNALISM Fraud Detection / Knowledge Graph16
  • 19.
    Background • US ITconsulting firm helped US Army streamline equipment deployments and maintenance spending • Saving lives by improving the operational readiness of Army equipment like tanks, radios, transports, aircraft, weaponry, etc. Business Problem • Needed to modernize procurement, budget and logistics processes for equipment & spare parts • Millions of connections among a tank’s bill-of- materials, for example • Improve “what if” cost calculations when planning missions and troop deployments • Mainframe systems required over 60 man-hrs to calculate changes… planning took too long. Solution and Benefits • 118M nodes & 185M relationships • Shed cost estimation times by 88% • Improved parts delivery timing and accuracy • DBA labor required dropped by 77% • Equipment TCO more predictable • Safer soldiers US Army / Calibre Systems Equipment Logistics Parts Assembly & Equipment Maintenance19
  • 20.
    State of thegraph in 2019
  • 21.
    2000+ 7/10 12/25 8/10 53K+ 100+ 300+ 450+ Adoption Top Retail Firms TopFinancial Firms Top Software Vendors Customers Partners • Creator of the Neo4j Graph Platform • 250+ employees • HQ in Silicon Valley, other offices include London, Munich, Paris and Malmö Sweden • $80M Series E led by Morgan Stanley & One Peak. • $160M total raised to date • Over 20M+ downloads & container pulls • 300+ enterprise subscription customers with over half with >$1B in revenue Ecosystem Startup Program Alumni Enterprise customers Partners Meet up members Events per year Neo4j - The Graph Company 2 1 The Industry’s Largest Dedicated Investment in Graphs
  • 22.
    Networks of PeopleBusiness Processes Knowledge Networks E.g., Risk management, Supply chain, Payments E.g., Employees, Customers, Suppliers, Partners, Influencers E.g., Enterprise content, Domain specific content, eCommerce content Data connections are increasing as rapidly as data volumes The Rise of Connections in Data Electronic Networks On-prem & cloud computing, Cellular, Telco & Internet, IoT, Blockchain
  • 23.
    CAR name: “Dan” born: May29, 1970 twitter: “@dan” name: “Ann” born: Dec 5, 1975 since: Jan 10, 2011 brand: “Volvo” model: “V70” Latitude: 37.5629900° Longitude: -122.3255300° Nodes • Can have Labels to classify nodes • Labels have native indexes Relationships • Relate nodes by type and direction Properties • Attributes of Nodes & Relationships • Stored as Name/Value pairs • Can have indexes and composite indexes • Visibility security by user/role Neo4j Invented the Labeled Property Graph Model MARRIED TO LIVES WITH PERSON PERSON 23
  • 24.
    24 Graph Databases areDesigned for Connected Data TRADITIONAL DATABASES BIG DATA TECHNOLOGY Store and retrieve data Aggregate and filter data Connections in data Real time storage & retrieval Real-Time Connected Insights Long running queries aggregation & filtering “Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code” Volker Pacher, Senior Developer Up to 3 Max # of hops 1 Millions
  • 25.
    Internal & Confidential,Neo4j Inc. 25 Neo4j Graph Advantage: Foundational Components 1 2 3 4 5 6 Index-Free Adjacency In memory and on flash/disk vs ACID Foundation Required for safe writes Full-Stack Clustering Causal consistency Language, Drivers, Tooling Developer Experience, Graph Efficiency, Type Safety Graph Engine Cost-Based Optimizer, Graph Statistics, Cypher Runtime Hardware Optimizations For next-gen infrastructure
  • 26.
    26 Strongly Differentiated CommercialOffering Enterprise Edition is Highly Differentiated from Community Open Source Edition Date/Time data type1 ✔ ✔ 3D Geospatial data types1 ✔ ✔ Native String Indexes – up to 5x faster writes1 ✔ ✔ 100B+ Bulk Importer1 ✔ Resumable1 Enterprise Cypher Runtime up to 70% faster – ✔ Hot Backups – 2x Faster1 ACID Transactions ✔ ✔ High-performance native API ✔ ✔ High-performance caching ✔ ✔ Cost-based query optimizer ✔ ✔ Graph algorithms library to support AI initiatives ✔ ✔ Massively parallel graph algorithms – ✔ Query monitoring with enriched metrics – ✔ User and role-based security – ✔ LDAP and Active Directory Integration – ✔ Kerberos security option – ✔ Multi-Clustering (partition of clusters)1 – ✔ Automatic Cache Warming1 – ✔ Rolling Upgrades1 – ✔ Resumable Copy/ Restore Cluster Member – ✔ New diagnostic metrics and support tools1 – ✔ Property Blacklisting – ✔ Language drivers for Java, Python, C# & JavaScript ✔ ✔ Bolt Binary Protocol ✔ ✔ RPM, Debian, Docker, Azure & AWS Cloud Delivery ✔ ✔ Intra-cluster encryption secures all traffic across data centers and cloud zones – ✔ IPv6 support in clustered deployments – Available High throughput, least-connected load balancing built into Bolt drivers – ✔ Causal Clustering, core and read-replica design at global scale for applications, analytics workflows, HA and DR – ✔ Enterprise Lock Manager accesses all cores on server – ✔ Labeled property graph model ✔ ✔ Native graph processing & storage ✔ ✔ Cypher graph query language ✔ ✔ Neo4j Browser with syntax highlighting ✔ ✔ Fast writes via native label indexes ✔ ✔ Composite Indexes ✔ ✔ Cypher for Apache Spark (CAPS) for big data analytics ✔ ✔ Graph size limitations 34B nodes None Auto reuse of deleted space – ✔ Property existence constraints – ✔ Cypher query tracing, monitoring and metrics – ✔ Node Key schema constraints – ✔ Neo4j Desktop: Free developer-friendly package with full database and tools – ✔ CommunityDatabase Features 
Architectural Features Graph Platform Features 1New in Neo4j 3.4 Enterprise Community Enterprise Community Enterprise
  • 27.
    Neo4j Graph PlatformVision 27 Development & Administration Analytics Tooling BUSINESS USERS DEVELOPERS ADMINS Graph Analytics Graph Transactions Data Integration Discovery & Visualization DATA ANALYSTS DATA SCIENTISTS Drivers & APIs APPLICATIONS AI openCypherCloud
  • 28.
    Development & Administration Analytics Tooling Graph Analytics Graph Transactions Data Integration Discovery& VisualizationDrivers & APIs AI Neo4j Database 3.4 & 3.5 • Full Text Search • Native Indexes (up to 5x faster writes) • 100B+ bulk importer Improved Admin Experience • Rolling upgrades • 2x faster backups • Cache Warming on startup • Improved diagnostics Morpheus for Apache Spark • Graph analytics in the data lake • In-memory Spark graphs from Apache Hadoop, Hive, Gremlin and Spark • Save graphs into Neo4j • High-speed data exchange between Neo4j & data lake • Progressive analysis using named graphs Graph Data Science • High speed graph algorithms Neo4j Bloom • New graph illustration and communication tool for non- technical users • Explore and edit graph • Search-based • Create storyboards • Foundation for graph data discovery • Integrated with graph platform Multi-Cluster routing built into Bolt drivers • Date/Time data type • 3-D Geospatial search • Secure, Horizontal Multi-Clustering • Property-value Security The Neo4j Graph Platform
  • 29.
  • 30.
    Neo4j Bloom 31 • Highfidelity • Scene navigation • Property views • Search suggestions • Saved phrase history • Property editor • Schema perspectives • Bloom chart type • Visualize • Communicate • Discover • Navigate • Isolate • Edit • Share
  • 31.
  • 32.
    Graphs Are VERYHungry for Data Graphs’ appetite to connect more data accelerates the ability to find adjacent innovations Customer iteration cycles from 2 weeks to 3 months
  • 33.
    Graph Database Surgingin Popularity 34
  • 34.
    20M+ Downloads 8M+ from Neo4jDistribution 12M+ from Docker Events 400+ Approximate Number of Neo4j Events per Year 50k+ Meetups Number of Meetup Members Globally 50k+ Trained/certified Neo4j professionals 1k Certified Trained Developers Largest Pool of Graph Technologists
  • 35.
    Density Drives ValueIn Graphs Metcalfe’s Law of the Network (V=n2) 5 hops < less Value 100’s of hops deliver immense VALUE
  • 36.
    "Neo4j continues todominate the graph database market.” “69% of enterprises have, or are planning to implement graphs over next 12 months” October, 2017 “The most widely stated reason in the survey for selecting Neo4j was to drive innovation” February, 2018 Critical Capabilities for DBMA “In fact, the rapid rise of Neo4j and other graph technologies may signal that data connectedness is indeed a separate paradigm from the model consolidation happening across the rest of the NoSQL landscape.” March, 2018 Analysts See Unique Benefits of Graphs "Neo4j is the clear market leader in the graph space. It has the most users, it uses a widely adopted language that is much easier than Gremlin and in many respects, it has consistently been a lot more innovative than its competitors.” “It is the Oracle or SQL Server of the graph database world.” March, 2019 "Our research suggests that graph databases have the best chance to survive and thrive as a distinct category (versus the other NoSQL models) because connected data applications present serious performance problems that only a specialized graph DB can solve.” March, 2019
  • 37.
    Neo4j Has a Ten Year Head Start Native Connectedness Differentiates Neo4j Conceive Code Compute Store Non-Native Graph DB Native Graph DB RDBMS Optimized for graph workloads
  • 38.
    Graph Database Vendor Landscape NEO4J SIGNIFICANTLY OUTPACES COMPETITION IN GRAPH LEADERSHIP & INVESTMENT, TECHNOLOGY CAPABILITY, COMMUNITY BREADTH AND PRODUCT MATURITY Graph Pioneer & Leader Architectures optimized for non-graph workloads. Not easily adaptable for graphs. Lack “minutes to milliseconds” performance. Few graph-expert resources Nascent products fall vastly short. Graphs as a checkbox. Slow performance. Playing ‘catch-up,’ requiring years to stabilize & grow. Aggressive posture & claims to secure PR. Many fail the “kill -9” test Graph pioneer & visionary. Largest, most active community. More customer successes than all other vendors combined. Strongest technology. Diverse roadmap: cloud, DBaaS, Spark, Algos for AI, GQL.
  • 39.
    Real-Time Recommendations Fraud Detection Network & IT Operations Master Data Management Knowledge Graph Identity & Access Management Common Graph Technology Use Cases AirBnb
  • 40.
    Highly Valuable Connected Data Use Cases Drive Enterprise Adoption Real-Time Recommendations Fraud Detection Network & IT Operations Master Data Management Identity & Access Management Knowledge Graph
  • 42.
    Background • Over 7M citizens suffer from Diabetes • Connecting over 400 researchers • Incorporates over 50 databases, 100k’s of Excel workbooks, 30 databases of biological samples • Sought to examine disease from as many angles as possible. Business Problem • Genes are connected by proteins or to metabolites, and patients are connected with their diets, etc… • Needed to improve the utilization of immensely technical data • Needed to cater to doctors and researchers with simple navigation, communication and connections of the graph. Solution and Benefits • Dr. Alexander Jarasch, Head of Bioinformatics and Data Management • Scientists can conduct parallel research without asking the same questions or repeating tests • Built views like a liver sample knowledge graph DZD - German Center for Diabetes Research Medical Genomic Research EE Customer since 2016 Q4
  • 43.
    Software Financial Services Telecom Retail & Consumer Goods Media & Entertainment Other Industries Airbus Over 300 Enterprises and 10s of Thousands of Projects on Neo4j
  • 44.
    Background • Fortune 100 heavy equipment manufacturer • 27 Million warranty & service documents parsed • Foundation for AI-based supply chain management Business Problem • Improve maintenance predictability • Need a knowledge base for 27 million warranty documents and maintenance orders • Graphs gather context for AI to identify ‘prime examples’ of connections among parts, suppliers, customers and their mechanics, and to anticipate when equipment will need servicing and by whom. Solution and Benefits • Text to knowledge graph • Common ontology for complaints, symptoms & parts • Anticipates when equipment will need servicing • Improves customer and brand satisfaction • Maximizes lifespan and value of equipment Caterpillar Heavy Equipment Manufacturing Parts Assembly & Equipment Maintenance
  • 45.
    7 of the Top 10 Software Companies Use Neo4j
  • 46.
    Background • Social network of 10M graphic artists • Peer-to-peer evaluation of art and works-in-progress • Job sourcing site for creatives • Massive, millions of updates (reads & writes) to Activity Feed • 150 Mongos to 48 Cassandras to 3 Neo4j’s! Business Problem • Artists subscribe, appreciate and curate “galleries” of works of their own and from other artists • Activities Feed is how everyone receives updates • 1st implementation was 150 MongoDB instances • 2nd implementation shrunk to 48 Cassandras, but it was still too slow and required heavy IT overhead Solution and Benefits • 3rd implementation shrunk to 3 Neo4j instances • Saved over $500k in annual AWS fees • Reduced data footprint from 50TB to 40GB • Significantly easier to introduce new features like “New projects in your Network” Adobe Behance Social Network of 10M Graphic Artists Social Network EE Customer since 2016 Q4
  • 47.
    8 of the Top 10 Insurance Companies Use Neo4j
  • 49.
  • 50.
  • 51.
    Background • Largest Cable TV & Internet Provider in US • 3rd Largest network on the planet • xFi is consumer experience in 3M houses • Internet, router, devices, security, voice & telephony • Transformational customer experience Business Problem • Integrate all experience in a smart home • Create innovative ideas based on cross-platform and household member preferences • Add integrated value of xFinity triple play & quad-play services (internet, VoIP, cable TV & home security) Solution and Benefits • Custom content per household member • Security reminders (kids are home, garage left open) • Serves millions of households • Makes content recommendations based on occupant, time of day, permissions and preferences • Has Siri-like voice commands COMCAST Xfinity xFi TELECOMMUNICATIONS Smart Home / Internet of Things EE Customer since 2016 Q4
  • 52.
    Analog to Digital Innovations
  • 53.
    Common Graph Entities are Analog People Locations Processes Devices Objects Motives • Who – People • What – Activities & Events • Where – Locations • When – Time • Why – Motives & Feelings • How – Processes, Devices & Networks Activities
  • 54.
    The Whiteboard Model Is the Physical Model Ideation is an analog activity • Easily understood • Easily evolved • Easy collaboration between business and IT
  • 55.
  • 56.
  • 57.
    Graphs Drive Innovation Context Paths Auto-Graphs Graph Layers 1st Order Graph Cross-Connect Cross-tech applications Internet of Things operations Transparent Neural Networks Blockchain-managed systems Adjacent graph layers inspire new innovations Metadata / Risk Management Knowledge Graphs AI-Powered Customer Experiences Connect unlike objects such as people to products, locations Mobile app explosion Recommendation engines Fraud detectors Desire for more context to follow connections Extract properties during traversals Connects like objects People, computer networks, telco, etc
  • 58.
    Cypher: Powerful and Expressive Query Language MATCH (:Person {name: "Dan"})-[:MARRIED_TO]->(spouse) MARRIED_TO Dan Ann NODE RELATIONSHIP TYPE LABEL PROPERTY VARIABLE
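For readers new to Cypher, the pattern on this slide can be read as: start at a node labeled Person whose name property is Dan, follow an outgoing MARRIED_TO relationship, and bind the node at the other end to the variable spouse. The plain-Python sketch below mimics that match over an in-memory list of relationships. It is only an illustration of the pattern's meaning and does not use Neo4j or its drivers. The `Node` class and `match_spouses` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str  # e.g. "Person"
    name: str   # the node's `name` property


dan = Node("Person", "Dan")
ann = Node("Person", "Ann")

# Each relationship is a directed (start, type, end) triple.
relationships = [(dan, "MARRIED_TO", ann)]


def match_spouses(rels, person_name):
    """Return end nodes matching (:Person {name})-[:MARRIED_TO]->(spouse)."""
    return [
        end
        for start, rel_type, end in rels
        if start.label == "Person"
        and start.name == person_name
        and rel_type == "MARRIED_TO"
    ]


spouses = match_spouses(relationships, "Dan")
print([s.name for s in spouses])  # ['Ann']
```

Because the relationship is directed, matching from "Ann" returns nothing here, just as the arrow in the Cypher pattern only follows outgoing MARRIED_TO relationships.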
  • 59.
    The GQL Manifesto: https://gql.today/ • Introduced in May 2018: https://gql.today/ • An initiative to immediately rally support for a unified Graph Query Language • Standards meetings are ongoing • All community members are encouraged to vote their support at https://gql.today/#vote
  • 60.
  • 61.
    Data Sources CLIENT Admin Dashboard Session Data Feedback Scored Recommendations Graph Algorithms AI / ML Click Stream Data INTELLIGENT RECOMMENDATIONS FRAMEWORK Discovery Exclude Boost Diversity User Segmentation Item Similarity Intelligent Recommendations Framework Recommendation Engines
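The slide names the framework's stages (Discovery, Exclude, Boost, Diversity) without describing how they work. The Python sketch below is one possible reading of the first three, offered only as an assumption: discover candidates from users with overlapping purchases, exclude items the user already owns, then multiply scores by optional boost factors. The data, the `recommend` function and the scoring rule are all hypothetical. The Diversity stage is not modeled.

```python
from collections import Counter

# Hypothetical purchase history: user -> set of items bought.
purchases = {
    "ann": {"tent", "stove"},
    "bob": {"tent", "stove", "lantern"},
    "cat": {"tent", "lantern", "boots"},
}


def recommend(user, purchases, boosts=None, limit=3):
    """Return up to `limit` (item, score) pairs, highest score first."""
    boosts = boosts or {}
    owned = purchases[user]
    scores = Counter()

    # Discovery: score items bought by users who share items with `user`,
    # weighting each candidate by the size of the overlap.
    for other, items in purchases.items():
        if other == user:
            continue
        overlap = len(owned & items)
        if overlap == 0:
            continue
        for item in items:
            scores[item] += overlap

    # Exclude: drop anything the user already owns.
    for item in owned:
        scores.pop(item, None)

    # Boost: apply optional per-item multipliers (default 1.0).
    for item in list(scores):
        scores[item] *= boosts.get(item, 1.0)

    # Rank by descending score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))[:limit]


print(recommend("ann", purchases))
print(recommend("ann", purchases, boosts={"boots": 5.0}))
```

In a graph database the discovery step would typically be a traversal from the user through purchased items to other users, rather than a loop over every user as in this sketch.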
  • 62.
    Graph Analytics Graph Algorithms Cypher for Apache Spark™ Graph-Enhanced AI & ML Similarity ML
  • 63.
    Graph & ML Algorithms in Neo4j +35 neo4j.com/graph-algorithms-book/ • Pathfinding & Search: Finds optimal paths or evaluates route availability and quality • Centrality / Importance: Determines the importance of distinct nodes in the network • Community Detection: Detects group clustering or partition options • Similarity: Evaluates how alike nodes are • Link Prediction: Estimates the likelihood of nodes forming a future relationship
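As a small illustration of the first two categories, the Python sketch below runs an unweighted breadth-first shortest path (pathfinding) and a degree count (the simplest centrality measure) on a toy undirected graph. It is written in plain Python for clarity and is not the Neo4j graph algorithms library. The graph and function names are made up for this example.

```python
from collections import deque

# Toy undirected graph as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def shortest_path(graph, source, target):
    """Breadth-first search; returns a fewest-hops path, or None if unreachable."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def degree_centrality(graph):
    """Degree of each node: the number of direct neighbors."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


print(shortest_path(graph, "A", "E"))  # ['A', 'B', 'D', 'E']
print(degree_centrality(graph))
```

Node D has the highest degree in this graph, and every path from A to E passes through it.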
  • 64.
    Graph and ML Algorithms in Neo4j neo4j.com/docs/graph-algorithms/current/ Updated April 2019 Pathfinding & Search • Parallel Breadth First Search & DFS • Shortest Path • Single-Source Shortest Path • All Pairs Shortest Path • Minimum Spanning Tree • A* Shortest Path • Yen’s K Shortest Path • K-Spanning Tree (MST) • Random Walk Centrality / Importance • Degree Centrality • Closeness Centrality • CC Variations: Harmonic, Dangalchev, Wasserman & Faust • Betweenness Centrality • Approximate Betweenness Centrality • PageRank • Personalized PageRank • ArticleRank • Eigenvector Centrality Community Detection • Triangle Count • Clustering Coefficients • Connected Components (Union Find) • Strongly Connected Components • Label Propagation • Louvain Modularity – 1 Step & Multi-Step • Balanced Triad (identification) Similarity • Euclidean Distance • Cosine Similarity • Jaccard Similarity • Overlap Similarity • Pearson Similarity Link Prediction • Adamic Adar • Common Neighbors • Preferential Attachment • Resource Allocations • Same Community • Total Neighbors
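Several of the similarity and link-prediction measures listed here reduce to simple set arithmetic over node neighborhoods. The Python sketch below computes Jaccard similarity, Common Neighbors, Preferential Attachment and Adamic Adar on a toy undirected graph using their textbook definitions. It is not the Neo4j library's implementation, and the graph and function names are illustrative assumptions.

```python
import math

# Toy undirected graph: node -> set of neighbors.
neighbors = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob", "dave"},
    "dave": {"alice", "carol"},
}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def common_neighbors(graph, u, v):
    """Number of neighbors shared by u and v."""
    return len(graph[u] & graph[v])


def preferential_attachment(graph, u, v):
    """Product of the two nodes' degrees."""
    return len(graph[u]) * len(graph[v])


def adamic_adar(graph, u, v):
    """Sum of 1 / ln(degree) over shared neighbors with degree above 1."""
    return sum(
        1.0 / math.log(len(graph[w]))
        for w in graph[u] & graph[v]
        if len(graph[w]) > 1
    )


# bob and dave are not connected, but they share both of their neighbors.
print(jaccard(neighbors["bob"], neighbors["dave"]))
print(common_neighbors(neighbors, "bob", "dave"))
print(preferential_attachment(neighbors, "bob", "dave"))
print(adamic_adar(neighbors, "bob", "dave"))
```

High scores for an unconnected pair such as bob and dave are what a link-prediction step would read as a likely future relationship.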
  • 65.
    Graph Analytics: Spark Cypher & Morpheus Objective: Draw new users from the Spark ecosystem to graphs & Neo4j (Also bolsters Cypher as the de-facto query language)
  • 66.
  • 67.
    EVIDENCE BASED MACHINE LEARNING SYSTEMS PRESCRIPTIVE ANALYTICS NATURAL LANGUAGE GENERATION “Yankees” “Giants” “Penguins” “Jets” “Bears” “Red Sox” NLP/TEXT MINING PREDICTIVE ANALYTICS RECOMMENDATION ENGINES DEEP LEARNING
  • 68.
  • 69.
  • 70.
    Thomson Reuters Graph • Data Fusion for Portfolio Managers • Graph layers
  • 71.
    Four Pillars of Graph-Enhanced AI 1. Knowledge Graphs: Context for Decisions 2. Connected Feature Extraction: Context for Accuracy 3. Graph-Accelerated AI: Context for Efficiency 4. AI Explainability: Context for Credibility
  • 72.
  • 73.
    Data Network Effect “A product, generally powered by machine learning, becomes smarter as it gets more data from your users. The more users use your product, the more data they contribute; the more data they contribute, the smarter your product becomes.” - Matt Turck
  • 74.
  • 76.
  • 78.