www.Objectivity.com

An Introduction To
Graph Databases
Leon Guzenda & Nick Quinn
August 20, 2013
Overview
• Introductions
• Graph Theory
• Commonly Used Graph Algorithms
• Graph Databases
• Current Implementations
• Use Cases
• Hands-On Tutorial
We Are From Objectivity Inc.
Company

• Objectivity, Inc. is headquartered in Sunnyvale, CA.
• Established in 1988 to tackle database problems that network/hierarchical/relational and file-based technologies
struggle with.
• Objectivity has over two decades of Big Data and NoSQL experience.

Products

• Develops NoSQL platforms for managing and discovering relationships and patterns in complex data:
• Objectivity/DB - an object database that manages localized, centralized or distributed databases
• InfiniteGraph - a massively scalable graph database built on Objectivity/DB that enables organizations to find, store and exploit the relationships in their data

Markets

• The Big Data market is projected to be around $12B in 2012, with a CAGR of 28% over the next five years.
• 40% per year data growth, cloud adoption, mobile usage and improved real-time analytics underpin Objectivity’s
growth opportunities as a Big Data analytics enabler.

Customers

• Embedded in hundreds of enterprises, government organizations and products - millions of deployments.

Financials

• Consistently generates increased revenues.
• Privately held by the employees and a few venture capital companies.

Copyright © Objectivity, Inc. 2012
GRAPH THEORY
The History of Graph Theory
1736: Leonhard Euler writes a paper on the “Seven Bridges of Königsberg”
1845: Gustav Kirchhoff publishes his electrical circuit laws
1852: Francis Guthrie poses the “Four Color Problem”
1878: Sylvester publishes an article in Nature magazine that describes graphs
1936: Dénes Kőnig publishes a textbook on Graph Theory
1941: Ramsey and Turán define Extremal Graph Theory
1959: De Bruijn publishes a paper summarizing Enumerative Graph Theory
1959: Erdős, Rényi and Gilbert define Random Graph Theory
1969: Heinrich Heesch publishes methods for attacking the “Four Color” problem by computer; Appel and Haken complete the proof in 1976
2003: Commercial Graph Database products start appearing on the market
Graph Theory Terminology...
VERTEX: A single node in a graph data structure
EDGE: A connection between a pair of VERTICES
PROPERTIES: Data items that belong to a particular Vertex or Edge
WEIGHT: A quantity associated with a particular Edge
GRAPH: A network of linked Vertex and Edge objects

Vertex 1
City: San Francisco
Pop: 812,826

Edge 1
Road: US-101
Miles: 47.8

Vertex 2
City: San Jose
Pop: 967,487
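
The terminology above can be sketched in a few lines of Python. This is a minimal illustration, not any product's API: the `Vertex` and `Edge` classes and their field names are assumptions made for this example, and the road mileage is used as the Edge's weight.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    # PROPERTIES: data items that belong to this Vertex
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    # An EDGE connects a pair of VERTICES and carries a WEIGHT
    tail: Vertex
    head: Vertex
    weight: float
    properties: dict = field(default_factory=dict)


san_francisco = Vertex({"City": "San Francisco", "Pop": 812_826})
san_jose = Vertex({"City": "San Jose", "Pop": 967_487})
road = Edge(san_francisco, san_jose, weight=47.8, properties={"Road": "Highway 101"})

# GRAPH: a network of linked Vertex and Edge objects
graph = {"vertices": [san_francisco, san_jose], "edges": [road]}
total_miles = sum(edge.weight for edge in graph["edges"])
```

Keeping the weight as a separate field from the free-form properties makes it easy for path algorithms to read it without knowing the property names.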
...Graph Theory Terminology...
SIMPLE/UNDIRECTED GRAPH: A Graph where each Vertex may be linked to one or more Vertex objects via Edge objects and each Edge object is connected to exactly two Vertex objects. Furthermore, neither Vertex connected to an Edge is more significant than the other.

DIRECTED GRAPH: A Graph where each Edge has a direction. In a Vertex + Edge + Vertex group (an “Arc”), the Vertex that the Edge points to is the “Head” and the Vertex that it starts from is the “Tail”.

MIXED GRAPH: A Graph in which some Edges are Undirected and others are Directed.
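
One way to see the difference between the undirected and directed cases is in how an adjacency map is built. The sketch below is illustrative only; the function name `build_adjacency` and the three-vertex example are assumptions, not part of the original slides.

```python
from collections import defaultdict


def build_adjacency(edges, directed):
    """Map each vertex to the set of vertices one edge away."""
    adjacency = defaultdict(set)
    for tail, head in edges:
        adjacency[tail].add(head)
        if not directed:
            # In an undirected graph neither end is more significant,
            # so the link is recorded in both directions.
            adjacency[head].add(tail)
    return adjacency


edges = [("A", "B"), ("B", "C")]
undirected = build_adjacency(edges, directed=False)
directed = build_adjacency(edges, directed=True)
```

In the directed version B can be reached from A but A cannot be reached from B, which is exactly the asymmetry that the Head and Tail roles describe.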
...Graph Theory Terminology
LOOP: An Edge that connects a Vertex to itself.
MULTIGRAPH: A Graph that allows multiple Edges between the same pair of Vertices, and Loops.
QUIVER: A Directed Graph where Vertices are allowed to be connected by multiple Arcs. A Quiver may include Loops.

WEIGHTED GRAPH: A Graph where a quantity is assigned to an Edge, e.g. a Length assigned to an Edge representing a road between two Vertices representing cities.

HALF EDGE: An Edge that is only connected to a single Vertex.
LOOSE EDGE: An Edge that isn't connected to any Vertices.
CONNECTIVITY: Two Vertices are Connected if it is possible to find a path between them.
COMMONLY USED GRAPH ALGORITHMS

Mac Evans
Commonly Used Graph Algorithms...
CONNECTEDNESS: Check whether or not a set of nodes in a Graph are connected.
All of the nodes in the graph below are connected, e.g. A to B, A to C via B etc.

SHORTEST PATH: The path between two nodes that visits the fewest intermediate nodes.
In the graph above, A->B->C->D is shorter than A->B->C->B->D (disallowing loops)

NODE DEGREE: The degree of a node in a network is a count of the number of connections it has to other nodes. The degree distribution is the probability distribution of these degrees in the whole network.
In the graph below, A and D have a node degree of 1. B and C have a node degree of 3.
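
The three measures on this slide can be sketched with a breadth-first search. The slide's figure is not reproduced in the text, so the edge list below is an assumption chosen to match the stated degrees (A and D have degree 1, B and C have degree 3); the extra node E is hypothetical.

```python
from collections import deque

# Hypothetical undirected graph standing in for the slide's figure.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E"), ("C", "E")]


def neighbors_of(edges):
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns a fewest-hop path, or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(adjacency[node]):
            if neighbor not in previous:  # never revisit a node, so no loops
                previous[neighbor] = node
                queue.append(neighbor)
    return None


adjacency = neighbors_of(EDGES)
# CONNECTEDNESS: every node can be reached from A
is_connected = all(shortest_path(adjacency, "A", n) is not None for n in adjacency)
# NODE DEGREE: number of connections per node
degrees = {node: len(adjacent) for node, adjacent in adjacency.items()}
# SHORTEST PATH
path_a_to_d = shortest_path(adjacency, "A", "D")
```

Breadth-first search finds a fewest-hop path only when every edge counts equally; for weighted edges an algorithm such as Dijkstra's would be the usual substitute.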
...Commonly Used Graph Algorithms...
CENTRALITY: An assessment of the importance of a node within a network.
Degree Centrality is the simplest, being a count of the number of connections that a node has.
In a directed graph it may be expressed as “Indegree” (# of incoming connections) and “Outdegree” (# of outgoing connections).
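
Degree centrality needs nothing more than counting. The arc list below is a made-up example (it does not come from the slides); each arc is written as (tail, head).

```python
from collections import Counter

# Hypothetical directed graph: each pair is (tail, head).
ARCS = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "C")]

outdegree = Counter(tail for tail, _ in ARCS)  # outgoing connections
indegree = Counter(head for _, head in ARCS)   # incoming connections

nodes = sorted({node for arc in ARCS for node in arc})
degree = {node: indegree[node] + outdegree[node] for node in nodes}
most_central = max(nodes, key=lambda node: degree[node])
```

Here C has three incoming arcs and none outgoing, so it has the highest total degree even though nothing flows out of it; which of the two counts matters depends on the question being asked.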
...Commonly Used Graph Algorithms...
CLOSENESS CENTRALITY: Closeness considers the shortest paths between nodes and
assigns a higher value to nodes that can be used to reach most other nodes most quickly.

In the graph below, node A has the greatest centrality as all other nodes can be reached in one
“hop”, whereas others require 1 hop to A or 2 hops to any other node.
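
A small sketch of closeness centrality for a star-shaped graph like the one described. It assumes one common definition, (number of other nodes) / (sum of hop counts to them), and the leaf names B to E are hypothetical.

```python
from collections import deque


def distances_from(adjacency, start):
    """Hop count from start to every reachable node (breadth-first search)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closeness(adjacency, node):
    dist = distances_from(adjacency, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Star graph: A is one hop from everything, the leaves are two hops apart.
STAR = {
    "A": {"B", "C", "D", "E"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A"},
    "E": {"A"},
}
scores = {node: closeness(STAR, node) for node in STAR}
```

With this definition A scores 4/4 and each leaf scores 4/7, because a leaf is 1 hop from A and 2 hops from each of the other three leaves.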

...Commonly Used Graph Algorithms...
SHORTEST PATH: The path between two nodes that visits the fewest intermediate nodes.
In the graph below, A->B->C->D is shorter than A->B->C->B->D (disallowing loops)

AVERAGE PATH LENGTH: The average of the shortest path lengths between all pairs of nodes in a graph.

TRANSITIVE CLOSURE: The process of exploring a graph by traversing relationships
until all nodes have been visited, but without revisiting nodes that are joined together in
loops.
In the graph above, A->B->C->D is a transitive closure.
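
Both ideas on this slide can be sketched with one breadth-first helper. The four-node chain A-B-C-D is assumed here as a stand-in for the slide's figure; the traversal records each node once, so nodes joined in loops are never revisited.

```python
from collections import deque
from itertools import combinations

# Assumed stand-in for the slide's figure: an undirected chain A-B-C-D.
PATH_GRAPH = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}


def hop_counts(adjacency, start):
    """Visit every reachable node exactly once and record its hop count."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


# Nodes collected by exhaustively traversing relationships from A.
reachable_from_a = set(hop_counts(PATH_GRAPH, "A"))

# Average of the shortest path lengths over all unordered pairs of nodes.
pair_lengths = [hop_counts(PATH_GRAPH, u)[v] for u, v in combinations(PATH_GRAPH, 2)]
average_path_length = sum(pair_lengths) / len(pair_lengths)
```

For the chain the six pair distances are 1, 2, 3, 1, 2 and 1, so the average is 10/6. The sketch assumes a connected graph; a disconnected pair would raise a `KeyError`.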
...Commonly Used Graph Algorithms...
GRAPH DIAMETER (or SPAN): The greatest distance between any pair of nodes in a graph.
It is computed by finding the shortest path between each pair of nodes. The maximum of these
path lengths is a measure of the diameter of the graph.
The diameters of the two graphs below are 2 and 5.
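
The computation described above can be sketched as follows. The slide's two figures are not available in the text, so a four-node star and a six-node chain are assumed as examples that happen to have diameters 2 and 5.

```python
from collections import deque


def eccentricity(adjacency, start):
    """Largest hop count from start to any reachable node (breadth-first search)."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return max(hops.values())


def diameter(adjacency):
    """Maximum over all nodes of the longest shortest path; assumes a connected graph."""
    return max(eccentricity(adjacency, node) for node in adjacency)


star = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

star_diameter = diameter(star)
chain_diameter = diameter(chain)
```

Running one search per node is fine for small graphs; on large graphs the all-pairs cost is the reason diameter is often estimated rather than computed exactly.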
...Commonly Used Graph Algorithms...
BETWEENNESS CENTRALITY: A centrality measure of a node within a graph.
Nodes that have a high probability of being visited on a randomly chosen shortest path between two randomly chosen nodes have a high “betweenness”.
In the graph below, node D has the highest betweenness centrality.
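
A sketch of how betweenness can be computed for a small unweighted graph, using Brandes' algorithm (a technique chosen for this example, not one named in the slides). The graph is hypothetical: two triangles joined through a bridge node D, so that D sits on every shortest path between the two sides.

```python
from collections import deque


def betweenness(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = dict.fromkeys(adjacency, 0.0)
    for source in adjacency:
        order = []                                # nodes in the order BFS visits them
        parents = {node: [] for node in adjacency}
        paths = dict.fromkeys(adjacency, 0)       # number of shortest paths from source
        paths[source] = 1
        hops = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in hops:
                    hops[neighbor] = hops[node] + 1
                    queue.append(neighbor)
                if hops[neighbor] == hops[node] + 1:
                    paths[neighbor] += paths[node]
                    parents[neighbor].append(node)
        # Walk back from the farthest nodes, passing credit to their parents.
        dependency = dict.fromkeys(adjacency, 0.0)
        for node in reversed(order):
            for parent in parents[node]:
                dependency[parent] += paths[parent] / paths[node] * (1 + dependency[node])
            if node != source:
                score[node] += dependency[node]
    # Each unordered pair was counted once from each end.
    return {node: value / 2 for node, value in score.items()}


# Hypothetical graph: triangle A-B-C, bridge C-D-E, triangle E-F-G.
BRIDGED = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D", "F", "G"], "F": ["E", "G"], "G": ["E", "F"],
}
scores = betweenness(BRIDGED)
```

D lies on the shortest path for all 3 x 3 pairs that span the two sides, giving it a score of 9, while C and E each carry 8 pairs.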
GRAPH DATABASES
Recognizing Graphs In Object Models
Tree Structures: 1-to-Many links from Object Class A to Object Class A, carrying Relationship Data
Graph (Network) Structures: Many-to-Many links from Object Class A to Object Class A, carrying Relationship Data
Why Do We Need Graph DBMSs?...
Relational Database
Think about the SQL query for finding all links between the two “blue” rows... Good luck!
Tables: Table_A, Table_B, Table_C, Table_D, Table_E, Table_F, Table_G

Relational databases aren’t good at handling complex relationships!
...Graph DBMSs Are Designed To Handle Relationships
Objectivity/DB or InfiniteGraph - The solution can be found with a few lines of code
A3

G4
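
To give a feel for why link-finding is short once rows are treated as vertices, here is a sketch in plain Python (not the Objectivity/DB or InfiniteGraph API). The row labels A3 and G4 come from the slide; every other row and link in `LINKS` is invented for illustration.

```python
def all_paths(links, start, goal, path=None):
    """Yield every loop-free path from start to goal (depth-first search)."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for neighbor in links.get(start, ()):
        if neighbor not in path:  # skip rows already on this path
            yield from all_paths(links, neighbor, goal, path)


# Hypothetical row-to-row references, followed in one direction only.
LINKS = {
    "A3": ["B1", "C2"],
    "B1": ["D5"],
    "C2": ["D5", "E7"],
    "D5": ["G4"],
    "E7": ["G4"],
}
paths = list(all_paths(LINKS, "A3", "G4"))
```

The equivalent SQL would need a join, or a recursive query, for every possible chain of tables, whereas the traversal simply follows whatever references exist.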
Graph Databases
• Data model:
– Node (Vertex) and Relationship (Edge) objects
– Directed
– May be a hypergraph (edges with multiple endpoints)

• Examples:
– InfiniteGraph, Neo4j, OrientDB, AllegroGraph, TitanDB and Dex

[Diagram: VERTEX and EDGE boxes linked with cardinalities 2 and N]
Graph DBMSs Use A Very Simple Object Model
Tree Structures: 1-to-Many, with Relationship Data between Object Class A objects
Graph (Network) Structures: Many-to-Many, with Relationship Data between Object Class A objects
GRAPH MODEL: each Object Class A object is a VERTEX and each item of Relationship Data is an EDGE
Basic Capabilities Of Most Graph Databases
• Rapid Graph Traversal (from a Start object to a Finish object)
• Inclusive or Exclusive Selection
• Find the Shortest or All Paths Between Objects
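
Inclusive and exclusive selection can be pictured as a filter applied during traversal. The sketch below is a generic illustration, not the query interface of any particular graph database; the `traverse` function, its parameters and the `NETWORK` data are all assumptions.

```python
from collections import deque


def traverse(adjacency, start, exclude=frozenset(), include=None):
    """Breadth-first traversal that never enters excluded vertices and,
    when `include` is given, only enters vertices in that set."""
    def allowed(vertex):
        return vertex not in exclude and (include is None or vertex in include)

    if not allowed(start):
        return []
    visited = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen and allowed(neighbor):
                seen.add(neighbor)
                visited.append(neighbor)
                queue.append(neighbor)
    return visited


NETWORK = {
    "Start": ["P", "X"],
    "P": ["Finish"],
    "X": ["Finish", "Q"],
    "Q": [],
    "Finish": [],
}
everything = traverse(NETWORK, "Start")
without_x = traverse(NETWORK, "Start", exclude={"X"})
only_x_branch = traverse(NETWORK, "Start", include={"Start", "X", "Q"})
```

Excluding X also hides Q, because Q can only be reached through X; pruning during the traversal, rather than filtering results afterwards, is what keeps such queries fast.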
InfiniteGraph Capabilities
• Parallel Graph Traversal
• Inclusive or Exclusive Selection
• Shortest or All Paths Between Objects
• Computational & Visualization Plug-Ins (e.g. Compute Cost To Date, Visualize)

Copyright © Objectivity, Inc. 2013
CURRENT IMPLEMENTATIONS
Graph Databases Pre-2003
Graph Databases Post-2003

X
Titan
Graph Databases Compared [UNSW]
DATA STORAGE FEATURES
Graph Databases Compared [DZone]

Source: http://goo.gl/ni4eoE
Graph Databases – Pros and Cons
• Strengths:
– Extremely fast for connected data
– Scales out, typically
– Easy to query (navigation)
– Simple data model

• Weaknesses:
– May not support distribution or sharding
– Requires conceptual shift... a different way of thinking

USE CASES
Example 1 - Market Analysis
The 10 companies that control a majority of U.S. consumer goods brands
Example 2 - Demographics
Used in social network analysis, marketing, medical research etc.
Example 3 - Seed To Consumer Tracking

Example 4 - Ad Placement Networks
Smartphone Ad placement - based on the user’s profile and location data
captured by opt-in applications.
• The location data can be stored and distilled in a key-value and column store
hybrid database, such as Cassandra.
• The locations are matched with geospatial data to deduce user interests.
• As Ad placement orders arrive, an application built on a graph database, such
as InfiniteGraph, matches groups of users with Ads:
• Maximizes relevance for the user.
• Yields maximum value for the advertiser and the placer.
Example 5 - Healthcare Informatics

• Problem: Physicians need better electronic records for managing patient data on a global
basis and for matching symptoms, causes, treatments and interdependencies to improve
diagnoses and outcomes.
• Solution: Create a database capable of leveraging existing architecture using NoSQL tools
such as Objectivity/DB and InfiniteGraph that can handle data capture, symptoms,
diagnoses, treatments, reactions to medications, interactions and progress.
• Result: It works:
• Diagnosis is faster and more accurate
• The knowledge base tracks similar medical cases.
• Treatment success rates have improved.
Example 6 - Big Data Analytics
Example 7 – Visual Analytics
Hands On With A Graph Database

• We'll be using InfiniteGraph today
• You'll need a Java Development environment on your machine

• If you haven't downloaded InfiniteGraph already, please go to:
http://goo.gl/XzJo6T [https://download.infinitegraph.com/index.aspx]

• We'll be covering a HelloGraph and a more complex sample program

More Related Content

What's hot (20)

PDF
Big Data Architecture
Guido Schmutz
 
PPTX
Databricks Platform.pptx
Alex Ivy
 
PDF
Introducing Neo4j
Neo4j
 
PPTX
Columnar Databases (1).pptx
ssuser55cbdb
 
PPTX
Azure data platform overview
James Serra
 
PDF
Introduction to Data Engineer and Data Pipeline at Credit OK
Kriangkrai Chaonithi
 
PPTX
Introduction to Graph Databases
Max De Marzi
 
PDF
Intro to Delta Lake
Databricks
 
PDF
Big Data Visualization
Raffael Marty
 
PPTX
NOSQL Databases types and Uses
Suvradeep Rudra
 
PPTX
Big data
Nausheen Hasan
 
PDF
Architecting Modern Data Platforms
Ankit Rathi
 
PDF
Data Mesh 101
ChrisFord803185
 
PPTX
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
Simplilearn
 
PPTX
Data streaming fundamentals
Mohammed Fazuluddin
 
PDF
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
CloudxLab
 
PPT
NOSQL Database: Apache Cassandra
Folio3 Software
 
PPTX
PPT on Hadoop
Shubham Parmar
 
PPTX
Hadoop
ABHIJEET RAJ
 
Big Data Architecture
Guido Schmutz
 
Databricks Platform.pptx
Alex Ivy
 
Introducing Neo4j
Neo4j
 
Columnar Databases (1).pptx
ssuser55cbdb
 
Azure data platform overview
James Serra
 
Introduction to Data Engineer and Data Pipeline at Credit OK
Kriangkrai Chaonithi
 
Introduction to Graph Databases
Max De Marzi
 
Intro to Delta Lake
Databricks
 
Big Data Visualization
Raffael Marty
 
NOSQL Databases types and Uses
Suvradeep Rudra
 
Big data
Nausheen Hasan
 
Architecting Modern Data Platforms
Ankit Rathi
 
Data Mesh 101
ChrisFord803185
 
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
Simplilearn
 
Data streaming fundamentals
Mohammed Fazuluddin
 
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
CloudxLab
 
NOSQL Database: Apache Cassandra
Folio3 Software
 
PPT on Hadoop
Shubham Parmar
 
Hadoop
ABHIJEET RAJ
 

Viewers also liked (14)

PPTX
Relational databases vs Non-relational databases
James Serra
 
PPTX
Neo4j - graph database for recommendations
proksik
 
PPTX
Lju Lazarevic
Connected Data World
 
KEY
NoSQL: Why, When, and How
BigBlueHat
 
PDF
Relational vs. Non-Relational
PostgreSQL Experts, Inc.
 
PPTX
Relational to Graph - Import
Neo4j
 
PDF
Designing and Building a Graph Database Application – Architectural Choices, ...
Neo4j
 
PDF
Converting Relational to Graph Databases
Antonio Maccioni
 
PDF
Graph Database, a little connected tour - Castano
Codemotion
 
PDF
Graph Based Recommendation Systems at eBay
DataStax Academy
 
PDF
Introduction to graph databases GraphDays
Neo4j
 
PPTX
An Introduction to NOSQL, Graph Databases and Neo4j
Debanjan Mahata
 
PDF
Data Modeling with Neo4j
Neo4j
 
PPTX
Data Mining: Graph mining and social network analysis
DataminingTools Inc
 
Relational databases vs Non-relational databases
James Serra
 
Neo4j - graph database for recommendations
proksik
 
Lju Lazarevic
Connected Data World
 
NoSQL: Why, When, and How
BigBlueHat
 
Relational vs. Non-Relational
PostgreSQL Experts, Inc.
 
Relational to Graph - Import
Neo4j
 
Designing and Building a Graph Database Application – Architectural Choices, ...
Neo4j
 
Converting Relational to Graph Databases
Antonio Maccioni
 
Graph Database, a little connected tour - Castano
Codemotion
 
Graph Based Recommendation Systems at eBay
DataStax Academy
 
Introduction to graph databases GraphDays
Neo4j
 
An Introduction to NOSQL, Graph Databases and Neo4j
Debanjan Mahata
 
Data Modeling with Neo4j
Neo4j
 
Data Mining: Graph mining and social network analysis
DataminingTools Inc
 
Ad

Similar to An Introduction to Graph Databases (20)

PPTX
Graph in data structures
AhsanRazaKolachi
 
PPTX
Análisis llamadas telefónicas con Teoría de Grafos y R
Rafael Nogueras
 
PDF
Graph Analyses with Python and NetworkX
Benjamin Bengfort
 
PDF
Node Path Visualizer Using Shortest Path Algorithms
IRJET Journal
 
PPTX
Directed Graph in Graph Theory and Combinatorics.pptx
AaradhyaDixit6
 
PPTX
Social Network Analysis and Visualization
Alberto Ramirez
 
PPTX
Data Structure Graph DMZ #DMZone
Doug Needham
 
PPTX
dms slide discrete mathematics sem 2 engineering
pranavstar99
 
PPTX
Lecture 2.3.1 Graph.pptx
king779879
 
PPTX
Graphs data structures
Jasleen Kaur (Chandigarh University)
 
PPTX
data structures and algorithms Unit 2
infanciaj
 
PDF
Create swath profiles in GRASS GIS
Skyler Sorsby
 
PPTX
Apache Spark GraphX highlights.
Doug Needham
 
PPTX
Spanning Tree in data structure and .pptx
asimshahzad8611
 
PDF
Edge Representation Learning with Hypergraphs
MLAI2
 
PPTX
SOCIAL NETWORK ANALYISI in engeenireg.pptx
urvashipundir04
 
PPTX
CPP Homework Help
C++ Homework Help
 
PPTX
Graph-Theory-The-Foundations-of-Modern-Networks.pptx
killeromm95
 
PDF
Ijcnc050213
IJCNCJournal
 
PPTX
Graph-terminology.pptx
sharlinE4
 
Graph in data structures
AhsanRazaKolachi
 
Análisis llamadas telefónicas con Teoría de Grafos y R
Rafael Nogueras
 
Graph Analyses with Python and NetworkX
Benjamin Bengfort
 
Node Path Visualizer Using Shortest Path Algorithms
IRJET Journal
 
Directed Graph in Graph Theory and Combinatorics.pptx
AaradhyaDixit6
 
Social Network Analysis and Visualization
Alberto Ramirez
 
Data Structure Graph DMZ #DMZone
Doug Needham
 
dms slide discrete mathematics sem 2 engineering
pranavstar99
 
Lecture 2.3.1 Graph.pptx
king779879
 
Graphs data structures
Jasleen Kaur (Chandigarh University)
 
data structures and algorithms Unit 2
infanciaj
 
Create swath profiles in GRASS GIS
Skyler Sorsby
 
Apache Spark GraphX highlights.
Doug Needham
 
Spanning Tree in data structure and .pptx
asimshahzad8611
 
Edge Representation Learning with Hypergraphs
MLAI2
 
SOCIAL NETWORK ANALYISI in engeenireg.pptx
urvashipundir04
 
CPP Homework Help
C++ Homework Help
 
Graph-Theory-The-Foundations-of-Modern-Networks.pptx
killeromm95
 
Ijcnc050213
IJCNCJournal
 
Graph-terminology.pptx
sharlinE4
 
Ad

More from InfiniteGraph (20)

PDF
Making Sense of Graph Databases
InfiniteGraph
 
PPTX
Webinar 3/12/14: Using Social Media to Drive Value
InfiniteGraph
 
PDF
NoSQL Simplified: Schema vs. Schema-less
InfiniteGraph
 
PDF
The Value of Explicit Schema for Graph Use Cases
InfiniteGraph
 
PDF
Solution Use Case Demo: The Power of Relationships in Your Big Data
InfiniteGraph
 
PDF
PowerOfRelationshipsInBigData_SVNoSQL
InfiniteGraph
 
PPT
Objectivity/DB: A Multipurpose NoSQL Database
InfiniteGraph
 
PPT
Making sense of the Graph Revolution
InfiniteGraph
 
PDF
Using A Distributed Graph Database To Make Sense Of Disparate Data Stores
InfiniteGraph
 
PPT
Turning Big Data into Smart Data with Graph Technologies
InfiniteGraph
 
PPTX
NoSQL Technology and Real-time, Accurate Predictive Analytics
InfiniteGraph
 
PPTX
How we Learned to Stop Worrying and Solve the Distributed Graph Problem
InfiniteGraph
 
PDF
Everything Goes Better With Bacon: Revisiting the Six Degrees Problem with a ...
InfiniteGraph
 
PPTX
Vodafone xone fev142013v3 ext
InfiniteGraph
 
PDF
Dbta Webinar Realize Value of Big Data with graph 011713
InfiniteGraph
 
PDF
Oracle no sql overview brief
InfiniteGraph
 
PPT
Infinite graph nosql meetup dec 2012
InfiniteGraph
 
PDF
Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph Technology
InfiniteGraph
 
PPTX
Silicon valley nosql meetup april 2012
InfiniteGraph
 
PPT
NOSQL Now! Presentation, August 24, 2011: Graph Databases: Connecting the Dot...
InfiniteGraph
 
Making Sense of Graph Databases
InfiniteGraph
 
Webinar 3/12/14: Using Social Media to Drive Value
InfiniteGraph
 
NoSQL Simplified: Schema vs. Schema-less
InfiniteGraph
 
The Value of Explicit Schema for Graph Use Cases
InfiniteGraph
 
Solution Use Case Demo: The Power of Relationships in Your Big Data
InfiniteGraph
 
PowerOfRelationshipsInBigData_SVNoSQL
InfiniteGraph
 
Objectivity/DB: A Multipurpose NoSQL Database
InfiniteGraph
 
Making sense of the Graph Revolution
InfiniteGraph
 
Using A Distributed Graph Database To Make Sense Of Disparate Data Stores
InfiniteGraph
 
Turning Big Data into Smart Data with Graph Technologies
InfiniteGraph
 
NoSQL Technology and Real-time, Accurate Predictive Analytics
InfiniteGraph
 
How we Learned to Stop Worrying and Solve the Distributed Graph Problem
InfiniteGraph
 
Everything Goes Better With Bacon: Revisiting the Six Degrees Problem with a ...
InfiniteGraph
 
Vodafone xone fev142013v3 ext
InfiniteGraph
 
Dbta Webinar Realize Value of Big Data with graph 011713
InfiniteGraph
 
Oracle no sql overview brief
InfiniteGraph
 
Infinite graph nosql meetup dec 2012
InfiniteGraph
 
Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph Technology
InfiniteGraph
 
Silicon valley nosql meetup april 2012
InfiniteGraph
 
NOSQL Now! Presentation, August 24, 2011: Graph Databases: Connecting the Dot...
InfiniteGraph
 

Recently uploaded (20)

PDF
Transforming Utility Networks: Large-scale Data Migrations with FME
Safe Software
 
PDF
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
PPTX
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit Team
 
PDF
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
PDF
What’s my job again? Slides from Mark Simos talk at 2025 Tampa BSides
Mark Simos
 
PDF
“NPU IP Hardware Shaped Through Software and Use-case Analysis,” a Presentati...
Edge AI and Vision Alliance
 
PDF
“Computer Vision at Sea: Automated Fish Tracking for Sustainable Fishing,” a ...
Edge AI and Vision Alliance
 
PDF
SIZING YOUR AIR CONDITIONER---A PRACTICAL GUIDE.pdf
Muhammad Rizwan Akram
 
PPTX
Designing_the_Future_AI_Driven_Product_Experiences_Across_Devices.pptx
presentifyai
 
PDF
Transcript: Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
PDF
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
PDF
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
PDF
UPDF - AI PDF Editor & Converter Key Features
DealFuel
 
PDF
“Squinting Vision Pipelines: Detecting and Correcting Errors in Vision Models...
Edge AI and Vision Alliance
 
PDF
CIFDAQ Market Wrap for the week of 4th July 2025
CIFDAQ
 
PDF
The 2025 InfraRed Report - Redpoint Ventures
Razin Mustafiz
 
PPTX
Future Tech Innovations 2025 – A TechLists Insight
TechLists
 
PPTX
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
PDF
Peak of Data & AI Encore AI-Enhanced Workflows for the Real World
Safe Software
 
PDF
UiPath DevConnect 2025: Agentic Automation Community User Group Meeting
DianaGray10
 
Transforming Utility Networks: Large-scale Data Migrations with FME
Safe Software
 
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit Team
 
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
What’s my job again? Slides from Mark Simos talk at 2025 Tampa BSides
Mark Simos
 
“NPU IP Hardware Shaped Through Software and Use-case Analysis,” a Presentati...
Edge AI and Vision Alliance
 
“Computer Vision at Sea: Automated Fish Tracking for Sustainable Fishing,” a ...
Edge AI and Vision Alliance
 
SIZING YOUR AIR CONDITIONER---A PRACTICAL GUIDE.pdf
Muhammad Rizwan Akram
 
Designing_the_Future_AI_Driven_Product_Experiences_Across_Devices.pptx
presentifyai
 
Transcript: Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
UPDF - AI PDF Editor & Converter Key Features
DealFuel
 
“Squinting Vision Pipelines: Detecting and Correcting Errors in Vision Models...
Edge AI and Vision Alliance
 
CIFDAQ Market Wrap for the week of 4th July 2025
CIFDAQ
 
The 2025 InfraRed Report - Redpoint Ventures
Razin Mustafiz
 
Future Tech Innovations 2025 – A TechLists Insight
TechLists
 
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
Peak of Data & AI Encore AI-Enhanced Workflows for the Real World
Safe Software
 
UiPath DevConnect 2025: Agentic Automation Community User Group Meeting
DianaGray10
 

An Introduction to Graph Databases

  • 1. www.Objectivity.com An Introduction To Graph Databases Leon Guzenda & Nick Quinn August 20, 2013
  • 2. Overview • Introductions • Graph Theory • Commonly Used Graph Algorithms • Graph Databases • Current Implementations • Use Cases • Hands-On Tutorial
  • 3. We Are From Objectivity Inc. Company • Objectivity, Inc. is headquartered in Sunnyvale, CA. • Established in 1988 to tackle database problems that network/hierarchical/relational and file-based technologies struggle with. • Objectivity has over two decades of Big Data and NoSQL experience Products • Develops NoSQL platforms for managing and discovering relationships and patterns in complex data: • Objectivity/DB - an object database that manages localized, centralized or distributed databases • InfiniteGraph - a massively scalable graph database built on Objectivity/DB that enables organizations to find, store and exploit the relationships in their data Markets • The Big Data market is projected to be around $12B in 2012, with a CAGR of 28% over the next five years. • 40% per year data growth, cloud adoption, mobile usage and improved real-time analytics underpin Objectivity’s growth opportunities as a Big Data analytics enabler. Customers • Embedded in hundreds of enterprises, government organizations and products - millions of deployments. Financials • Consistently generates increased revenues. • Privately held by the employees and a few venture capital companies. Copyright © Objectivity, Inc. 2012
  • 5. The History of Graph Theory 1736: Leonard Euler writes a paper on the “Seven Bridges of Konisberg” 1845: Gustav Kirchoff publishes his electrical circuit laws 1852: Francis Guthrie poses the “Four Color Problem” 1878: Sylvester publishes an article in Nature magazine that describes graphs 1936: Dénes Kőnig publishes a textbook on Graph Theory 1941: Ramsey and Turán define Extremal Graph Theory 1959: De Bruijn publishes a paper summarizing Enumerative Graph Theory 1959: Erdos, Renyi and Gilbert define Random Graph Theory 1969: Heinrich Heesch solves the “Four Color” problem 2003: Commercial Graph Database products start appearing on the market
  • 6. Graph Theory Terminology... VERTEX: A single node in a graph data structure EDGE: A connection between a pair of VERTICES PROPERTIES: Data items that belong to a particular Vertex or Edge WEIGHT: A quantity associated with a particular Edge GRAPH: A network of linked Vertex and Edge objects Vertex 1 City: San Francisco Pop: 812,826 Edge 1 Road: I-101 Miles: 47.8 Vertex 2 City: San Jose Pop: 967,487
  • 7. ...Graph Theory Terminology... SIMPLE/UNDIRECTED GRAPH: A Graph where each VERTEX may be linked to one or more Vertex objects via Edge objects and each Edge object is connected to exactly two Vertex objects. Furthermore, neither Vertex connected to an Edge is more significant than the other. DIRECTED GRAPH: A Simple/Undirected Graph where one Vertex in a Vertex + Edge + Vertex group (an “Arc” or “Path”) can be considered the “Head” of the Path and the other can be considered the “Tail”. MIXED GRAPH: A Graph in which some paths are Undirected and others are Directed.
  • 8. ...Graph Theory Terminology LOOP: An Edge that is doubly-linked to the same Vertex MULTIGRAPH: A Graph that allows multiple Edges and Loops QUIVER: A Graph where Vertices are allowed to be connected by multiple Arcs. A Quiver may include Loops. WEIGHTED GRAPH: A Graph where a quantity is assigned to an Edge, e.g. a Length assigned to an Edge representing a road between two Vertices representing cities.  HALF EDGE: An Edge that is only connected to a single Vertex  LOOSE EDGE: An Edge that isn't connected to any Vertices.  CONNECTIVITY: Two Vertices are Connected if it is possible to find a path between them.
  • 9. COMMONLY USED GRAPH ALGORITHMS Mac Evans
  • 10. Commonly Used Graph Algorithms... CONNECTEDNESS: Check whether or not a set of nodes in a Graph are connected. All of the nodes in the graph below are connected, e.g. A to B, A to C via B etc. SHORTEST PATH: The path between two nodes that visits the fewest intermediate nodes. In the graph above, A->B->C->D is shorter than A->B->C->B->D (disallowing loops) NODE DEGREE: The degree of a node in a network is a count of the number of connections it has to other nodes. The degree distribution is the probability distribution of these degrees in the whole network. In the graph below, A and D have a node degree of 1. B and C have a node degree of 3.
  • 11. ...Commonly Used Graph Algorithms... CENTRALITY: An assessment of the importance of a node within a network. Degree Centrality is the simplest, being a count of the number of connections that a node has. It may be expressed as “Indegree” (# of incoming connections) and “Outdegre” (# of outgoing connections).
  • 12. ...Commonly Used Graph Algorithms... CLOSENESS CENTRALITY: Closeness considers the shortest paths between nodes and assigns a higher value to nodes that can be used to reach most other nodes most quickly. In the graph below, node A has the greatest centrality as all other nodes can be reached in one “hop”, whereas others require 1 hop to A or 2 hops to any other node. A
  • 13. Commonly Used Graph Algorithms... CONNECTEDNESS: Check whether or not a set of nodes in a Graph are connected. All of the nodes in the graph below are connected, e.g. A to B, A to C via B etc. SHORTEST PATH: The path between two nodes that visits the fewest intermediate nodes. In the graph above, A->B->C->D is shorter than A->B->C->B->D (disallowing loops) NODE DEGREE: The degree of a node in a network is a count of the number of connections it has to other nodes. The degree distribution is the probability distribution of these degrees in the whole network. In the graph below, A and D have a node degree of 1. B andC have a node degree of 3.
  • 14. ...Commonly Used Graph Algorithms... SHORTEST PATH: The path between two nodes that visits the fewest intermediate nodes. In the graph below, A->B->C->D is shorter than A->B->C->B->D (disallowing loops) AVERAGE PATH LENGTH: The average of all path lengths between all pairs of nodes in a graph. TRANSITIVE CLOSURE: The process of exploring a graph by traversing relationships until all nodes have been visited, but without revisiting nodes that are joined together in loops. In the graph above, A->B->C->D is a transitive closure.
  • 15. ...Commonly Used Graph Algorithms... GRAPH DIAMETER (or SPAN): The greatest distance between any pair of nodes in a graph. It is computed by finding the shortest path between each pair of nodes. The maximum of these path lengths is a measure of the diameter of the graph. The diameters of the two graphs below are 2 and 5.
  • 16. ...Commonly Used Graph Algorithms... BETWEENESS CENTRALITY: A centrality measure of a node within a graph. Nodes that have a high probability of being visited on a randomly chosen short path between two randomly chosen nodes have a high “betweeness” In the graph below, node D has the highest betweeness centrality.
  • 18. Recognizing Graphs In Object Models... Tree Structures 1-to-Many Object Class A
  • 19. ...Recognizing Graphs In Object Models... Tree Structures 1-to-Many Relationship Data Object Class A Object Class A
  • 20. Recognizing Graphs In Object Models... Tree Structures 1-to-Many Relationship Data Object Class A Object Class A Graph (Network) Structures Many-to-Many Object Class A
  • 21. Recognizing Graphs In Object Models... Tree Structures 1-to-Many Relationship Data Object Class A Object Class A Graph (Network) Structures Many-to-Many Relationship Data Object Class A Object Class A Copyright © Objectivity, Inc. 2012
  • 22. Why Do We Need Graph DBMSs?... Relational Database Think about the SQL query for finding all links between the two “blue” rows... Good luck! Table_A Table_B Table_C Table_D Table_E Table_F Table_G Relational databases aren’t good at handling complex relationships!
  • 23. ...Graph DBMSs Are Designed To Handle Relationships Relational Database Think about the SQL query for finding all links between the two “blue” rows... Good luck! Table_A Table_B Table_C Table_D Table_E Table_F Table_G Objectivity/DB or InfiniteGraph - The solution can be found with a few lines of code A3 G4
  • 24. Graph Databases • Data model: – Node (Vertex) and Relationship (Edge) objects – Directed – May be a hypergraph (edges with multiple endpoints) • Examples: – InfiniteGraph, Neo4j, OrientDB, AllegroGraph, TitanDB and Dex VERTEX 2 N EDGE
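
The node-and-relationship data model on this slide can be illustrated with a few plain Java classes. This is only a sketch of the idea: `GraphModelSketch`, `Vertex`, `Edge` and `connect` are hypothetical names, not the API of InfiniteGraph or of any other product listed above, and hypergraph edges are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model; not the API of any graph database named in the slides.
final class GraphModelSketch {

    // A node (vertex) holds its own data plus the edges that leave it.
    static final class Vertex {
        final String name;
        final List<Edge> outgoing = new ArrayList<>();

        Vertex(String name) {
            this.name = name;
        }
    }

    // A relationship (edge) is a first-class object: it has a direction and can carry data.
    static final class Edge {
        final String label;
        final Vertex from;
        final Vertex to;

        Edge(String label, Vertex from, Vertex to) {
            this.label = label;
            this.from = from;
            this.to = to;
        }
    }

    // Creates a directed edge and registers it on its source vertex.
    static Edge connect(Vertex from, Vertex to, String label) {
        Edge edge = new Edge(label, from, to);
        from.outgoing.add(edge);
        return edge;
    }
}
```

Because each vertex keeps direct references to its edges, following a relationship is a pointer hop rather than a join, which is the property the traversal slides rely on.
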
  • 25. Graph DBMSs Use A Very Simple Object Model Tree Structures 1-to-Many Relationship Data Object Class A Object Class A Graph (Network) Structures GRAPH MODEL Many-to-Many Relationship Data EDGE Object Class A Object Class A VERTEX
  • 26. Basic Capabilities Of Most Graph Databases... Rapid Graph Traversal Start Finish
  • 27. ...Basic Capabilities Of Most Graph Databases... Rapid Graph Traversal Inclusive or Exclusive Selection X Start Start X
  • 28. ...Basic Capabilities Of Most Graph Databases Rapid Graph Traversal Inclusive or Exclusive Selection X Start Start X Find the Shortest or All Paths Between Objects Start Finish
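
Finding the shortest path or all paths between two objects can be sketched in plain Java as a breadth-first search and a depth-first enumeration of loop-free paths. The class `PathSearchSketch`, its method names and the four-node sample graph are assumptions for illustration; a graph database would run the equivalent traversal inside its own engine.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; names are illustrative and not taken from the slides.
final class PathSearchSketch {

    // Builds an undirected adjacency map from pairs such as {"A", "B"}.
    static Map<String, List<String>> graph(String[][] edges) {
        Map<String, List<String>> adj = new HashMap<>();
        for (String[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new ArrayList<>()).add(e[0]);
        }
        return adj;
    }

    // Shortest path by hop count: breadth-first search that remembers each node's parent.
    static List<String> shortestPath(Map<String, List<String>> adj, String from, String to) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(from, null); // the start node has no parent
        queue.add(from);
        while (!queue.isEmpty()) {
            String v = queue.remove();
            if (v.equals(to)) {
                LinkedList<String> path = new LinkedList<>();
                for (String n = to; n != null; n = parent.get(n)) {
                    path.addFirst(n);
                }
                return path;
            }
            for (String w : adj.getOrDefault(v, Collections.emptyList())) {
                if (!parent.containsKey(w)) {
                    parent.put(w, v);
                    queue.add(w);
                }
            }
        }
        return Collections.emptyList(); // no route between the two nodes
    }

    // All loop-free paths: depth-first search that never revisits a node already on the path.
    static List<List<String>> allPaths(Map<String, List<String>> adj, String from, String to) {
        List<List<String>> found = new ArrayList<>();
        Deque<String> current = new ArrayDeque<>();
        current.addLast(from);
        extend(adj, to, current, found);
        return found;
    }

    private static void extend(Map<String, List<String>> adj, String to,
                               Deque<String> current, List<List<String>> found) {
        String last = current.peekLast();
        if (last.equals(to)) {
            found.add(new ArrayList<>(current));
            return;
        }
        for (String w : adj.getOrDefault(last, Collections.emptyList())) {
            if (!current.contains(w)) {
                current.addLast(w);
                extend(adj, to, current, found);
                current.removeLast();
            }
        }
    }

    // Assumed sample graph with edges A-B, B-C, C-D and B-D.
    static Map<String, List<String>> sample() {
        return graph(new String[][] {{"A", "B"}, {"B", "C"}, {"C", "D"}, {"B", "D"}});
    }
}
```

In the assumed sample graph there are two loop-free routes from A to D (via B directly, or via B and C), and the breadth-first search returns the shorter one.
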
  • 29. InfiniteGraph Capabilities Parallel Graph Traversal Inclusive or Exclusive Selection X Start Start X Shortest or All Paths Between Objects Computational & Visualization Plug-Ins Compute Cost To Date Start Finish Start Visualize Copyright © Objectivity, Inc. 2013
  • 33. Graph Databases Compared [UNSW] DATA STORAGE FEATURES
  • 34. Graph Databases Compared [DZone] Source: http://goo.gl/ni4eoE
  • 35. Graph Databases – Pros and Cons • Strengths: – Extremely fast for connected data – Scales out, typically – Easy to query (navigation) – Simple data model • Weaknesses: – May not support distribution or sharding – Requires conceptual shift... a different way of thinking VERTEX 2 N EDGE
  • 37. Example 1 - Market Analysis The 10 companies that control a majority of U.S. consumer goods brands
  • 38. Example 2 - Demographics Used in social network analysis, marketing, medical research etc.
  • 39. Example 3 - Seed To Consumer Tracking ?
  • 40. Example 4 - Ad Placement Networks Smartphone ad placement, based on the user’s profile and location data captured by opt-in applications. • The location data can be stored and distilled in a key-value and column store hybrid database, such as Cassandra. • The locations are matched with geospatial data to deduce user interests. • As ad placement orders arrive, an application built on a graph database such as InfiniteGraph matches groups of users with ads: • Maximizes relevance for the user. • Yields maximum value for the advertiser and the placer.
  • 42. Example 5 - Healthcare Informatics • Problem: Physicians need better electronic records for managing patient data on a global basis and for matching symptoms, causes, treatments and interdependencies to improve diagnoses and outcomes. • Solution: Create a database that leverages existing architecture, using NoSQL tools such as Objectivity/DB and InfiniteGraph, and that can handle data capture, symptoms, diagnoses, treatments, reactions to medications, interactions and progress. • Result: It works: • Diagnosis is faster and more accurate. • The knowledge base tracks similar medical cases. • Treatment success rates have improved.
  • 43. Example 6 - Big Data Analytics
  • 44. Example 7 – Visual Analytics
  • 45. Hands On With A Graph Database • We'll be using InfiniteGraph today • You'll need a Java development environment on your machine • If you haven't downloaded InfiniteGraph already, please go to: http://goo.gl/XzJo6T [https://download.infinitegraph.com/index.aspx] • We'll be covering a HelloGraph and a more complex sample program

Editor's Notes

  • #44: By adopting a polyglot approach, one can keep using existing SQL-based architecture and databases while still gaining the competitive advantage that the latest NoSQL technologies provide. One example of this polyglot approach is shown here. The technologies used would depend on the use case.
  • #45: Note that object-oriented databases are classed as NoSQL here.