Public
11th May 2017
Syed Haniff
Creating a Data Distribution Knowledge Base using Neo4j
Using graph technologies to map and manage data flows within the Bank
1
• Reference data at UBS
• Building an integrated data distribution platform
• Creating a Knowledge Base using Neo4j
Overview
2
• Founded 1854
• Headquarters: Zurich, Switzerland
• Operates in 50+ countries
• Around 60,000 employees
• 6 Businesses
– Wealth Management
– Wealth Management Americas
– Personal & Corporate Banking
– Asset Management
– Investment Bank
– Corporate Centre
About UBS
3
Group Data Services (GDS) manages the mastering and distribution of reference data to consumers within the Bank.
About Group Data Services
4
• Externally and internally sourced non-transactional data:
– Account
– Book
– Calendar
– Client
– Confirms
– Financial Instrument
– Legal Entity
– Group Dictionary
– Prices
– Product
– Trading Agreement
– Settlement Instruction
Reference Data at UBS
5
12 Data Domains
18 Datasets
7 Distribution Channels
400+ Integrations
1,000s of Attributes
Group Data Services in Numbers
6
Providing timely, accurate, and complete reference data to users, systems, and
processes through a number of channels.
Reference Data Distribution
7
• Masters send normalized, canonical datasets.
• Consumers land and join datasets themselves.
• Good for producers (master data sources) … not so good for consumers.
Features / Overview
Data Distribution – Previously
8
Example – Consumer joins
Consumers store multiple messages from multiple domains and resolve joins
themselves
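A minimal sketch of what consumer-side joining can look like. The message shapes, the two domains, and the field names (`instrument_id`, `issuer_id`, `entity_id`) are hypothetical and not taken from the actual feeds. The consumer lands each feed in its own store and resolves the join itself, including the case where the referenced record has not arrived yet.

```python
# Hypothetical consumer-side stores, one per reference data domain.
instrument_store = {}
legal_entity_store = {}


def on_message(domain, message):
    """Land an incoming message in the store for its domain."""
    if domain == "instrument":
        instrument_store[message["instrument_id"]] = message
    elif domain == "legal_entity":
        legal_entity_store[message["entity_id"]] = message
    else:
        raise ValueError(f"unknown domain: {domain}")


def joined_instrument(instrument_id):
    """Resolve the join locally; the issuer may be missing or stale."""
    instrument = instrument_store[instrument_id]
    issuer = legal_entity_store.get(instrument["issuer_id"])
    return {
        "instrument_id": instrument_id,
        "name": instrument["name"],
        "issuer_name": issuer["name"] if issuer else None,
    }


on_message("instrument", {"instrument_id": "I1", "name": "Bond A", "issuer_id": "E1"})
before = joined_instrument("I1")  # issuer record not received yet
on_message("legal_entity", {"entity_id": "E1", "name": "Issuer One"})
after = joined_instrument("I1")   # join now resolves
```

Every consumer that needs the joined record has to repeat this storage and lookup logic, which is the duplication the slide describes.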
9
Driver | Situation | Impact
Simplification | Multiple components doing the same or similar tasks | Cost+, Complexity+
Risk Reduction | Consumers have to store and join reference data | Data staleness+, Potential for errors+
Efficiency | Consumers receive updates they are not interested in | Storage volumes+, Processing volumes+
Business Drivers for Change
10
• Single platform consuming data from masters
• Platform integrates datasets
• Custom or normalized datasets sent via standardized channels
Features / Overview
Distribution Platform – Blueprint
11
Example – Platform joins
Data joined at source and available for multiple consumers – simplifies consumption
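As a contrast to consumer-side joins, here is a minimal sketch of the platform joining once and serving several consumers. The dataset contents, consumer names, and the `subscriptions` mapping are all hypothetical; the real platform's data model and channels are not shown in the slides.

```python
# Hypothetical master datasets keyed by their identifiers.
instruments = {"I1": {"name": "Bond A", "issuer_id": "E1", "currency": "CHF"}}
legal_entities = {"E1": {"name": "Issuer One", "country": "CH"}}

# Hypothetical subscriptions: each consumer lists the attributes it wants.
subscriptions = {
    "risk_system": ["instrument_id", "issuer_name", "issuer_country"],
    "settlement_system": ["instrument_id", "currency"],
}


def pre_join():
    """Join the datasets once, on the platform."""
    rows = []
    for instrument_id, instrument in instruments.items():
        issuer = legal_entities.get(instrument["issuer_id"], {})
        rows.append({
            "instrument_id": instrument_id,
            "name": instrument["name"],
            "currency": instrument["currency"],
            "issuer_name": issuer.get("name"),
            "issuer_country": issuer.get("country"),
        })
    return rows


def view_for(consumer, rows):
    """Project the pre-joined rows onto the consumer's subscribed attributes."""
    wanted = subscriptions[consumer]
    return [{key: row[key] for key in wanted} for row in rows]


joined = pre_join()
risk_view = view_for("risk_system", joined)
settlement_view = view_for("settlement_system", joined)
```

The join logic lives in one place, and each consumer only receives the attributes it asked for.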
12
• Single Platform
• Pre-joined datasets
• Flexible subscription to attributes
• More consumer-oriented …
But there are still things we'd like to know …
Platform Benefits
13
What datasets and attributes do we provide?
How are the different datasets related?
How are users receiving our data?
Which consumers are using which attributes?
Knowledge Base
Data Distribution – Questions
18
A system component that lets us describe the journey of the
datasets and attributes from master systems to consumers
What is the Knowledge Base?
19
Building the Knowledge Base – Example Model
20
• Initially, platform (not human) requirements
• XLS + custom DSL (Domain Specific Language)
• E.g. composite INSTRUMENT dataset
– BOND_BONDRATING, EQUITY_EQUITYRATING
– "," → union between two datasets
– "_" → join between two datasets
✓ Innovative and allowed us to build platform
✗ Limited, Complex, Inflexible
Physical Model – 1.0
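To make the DSL example concrete, here is a minimal sketch of how such an expression could be read. It assumes only what the slide states: "," separates unioned branches and "_" separates joined datasets. The function name `parse_composite` and the nested-list result are hypothetical; the actual DSL grammar and tooling are not shown.

```python
def parse_composite(expression):
    """Parse 'A_B, C_D' into a list of join groups that are unioned together.

    Assumes ',' means union and '_' means join, so dataset names
    cannot themselves contain '_' or ','.
    """
    branches = []
    for branch in expression.split(","):
        datasets = [name.strip() for name in branch.split("_") if name.strip()]
        if not datasets:
            raise ValueError(f"empty branch in expression: {expression!r}")
        branches.append(datasets)
    return branches


# Each inner list is one join; the outer list is the union of those joins.
plan = parse_composite("BOND_BONDRATING, EQUITY_EQUITYRATING")
```

Under these assumptions the INSTRUMENT composite reads as the union of (BOND joined with BONDRATING) and (EQUITY joined with EQUITYRATING). The restriction on characters in dataset names hints at why an encoding like this can become limiting.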
21
Can it answer our questions …?
22
• Challenging to build a relational model that answers all the (diverse) questions
• Lots of different entities …
• Lots of different relationships …
• Not all data flows are the same …
• Tough to get the performance needed with a generic relational model
… Not really or easily anyway
23
The "Eureka!" moment …
Looks like a graph … maybe we should store it as a graph(!)
24
• Store the metamodel in a graph database
• Neo4j
– Used in the Bank
– Mature
– Comprehensive resources online
– Drivers / Adapters matching language choices
Physical Model – 2.0
25
Example Dataset – Equity
26
Answers to the questions …
What datasets and attributes do we provide?
MATCH (d:Dataset)-[:OWNS]->(a:PhysicalAttribute)
RETURN d, a;
CYPHER QUERY
27
Answers to the questions …
How are the different datasets related?
MATCH (d1:Dataset)<-[:JOINS]-(j:JoinRelation),
      (d2:Dataset)<-[:JOINS]-(j)
RETURN d1, j, d2;
CYPHER QUERY
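One detail of the pattern above: `d1` and `d2` bind to two different `JOINS` relationships of the same `JoinRelation`, so each related pair should come back twice, once in each order. The Python sketch below mimics that behaviour on hypothetical data (the `joins` mapping and its names are invented for illustration) and shows one way to keep a single row per pair.

```python
from itertools import combinations, permutations

# Hypothetical JOINS edges: JoinRelation node -> the datasets it joins.
joins = {
    "equity_to_rating": ["EQUITY", "EQUITYRATING"],
    "bond_to_rating": ["BOND", "BONDRATING"],
}

# What the pattern yields: every ordered pair of distinct datasets per JoinRelation.
ordered_rows = [
    (d1, j, d2)
    for j, datasets in joins.items()
    for d1, d2 in permutations(datasets, 2)
]

# One row per unordered pair, the equivalent of adding an ordering condition.
unique_rows = [
    (d1, j, d2)
    for j, datasets in joins.items()
    for d1, d2 in combinations(sorted(datasets), 2)
]
```

In Cypher the same effect could be achieved with a condition that orders the two datasets, for example on a name property if the model has one.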
28
Answers to the questions …
How are users receiving our data?
MATCH (c:Consumer)-[:RECEIVES_VIA|INTERESTED_IN]->(v)
RETURN c, v
CYPHER QUERY
29
Answers to the questions …
Which consumers are using which attributes?
MATCH (c:Consumer)-[:INTERESTED_IN]->(view:Dataset),
      (view)-[:SELECTS]->(output:Dataset),
      (output)<-[:TARGET_OF]-(aggregation:Transformer)-[:SOURCE_OF]->(aggregate:Dataset),
      (aggregate)-[:OWNS]->(parts:Dataset),
      (parts)-[:OWNS]->(a:PhysicalAttribute)
RETURN c, view, output, aggregation, aggregate, parts, a
CYPHER QUERY
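To show what the query above walks through, here is a small in-memory sketch that follows the same fixed path hop by hop. The node names and edges are hypothetical, node labels are ignored, and only the relationship types come from the query; it is an illustration of the traversal, not of how Neo4j executes it.

```python
from collections import defaultdict

# Hypothetical edges as (start, relationship type, end); names are illustrative.
edges = [
    ("risk_system", "INTERESTED_IN", "risk_view"),
    ("risk_view", "SELECTS", "instrument_out"),
    ("aggregator", "TARGET_OF", "instrument_out"),
    ("aggregator", "SOURCE_OF", "instrument_agg"),
    ("instrument_agg", "OWNS", "equity"),
    ("equity", "OWNS", "isin"),
    ("equity", "OWNS", "ticker"),
]

outgoing = defaultdict(list)  # (start, type) -> [end, ...]
incoming = defaultdict(list)  # (end, type) -> [start, ...]
for start, rel_type, end in edges:
    outgoing[(start, rel_type)].append(end)
    incoming[(end, rel_type)].append(start)


def attributes_for(consumer):
    """Walk the same fixed path as the Cypher pattern, one hop per loop."""
    found = set()
    for view in outgoing[(consumer, "INTERESTED_IN")]:
        for output in outgoing[(view, "SELECTS")]:
            # TARGET_OF points from the transformer to the output dataset.
            for transformer in incoming[(output, "TARGET_OF")]:
                for aggregate in outgoing[(transformer, "SOURCE_OF")]:
                    for part in outgoing[(aggregate, "OWNS")]:
                        found.update(outgoing[(part, "OWNS")])
    return sorted(found)


result = attributes_for("risk_system")
```

Each nested loop corresponds to one relationship in the pattern, which is the part the graph database expresses declaratively.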
30
• Single source of truth
• Governance and lineage easier
• New insights for consumers
• New insights for producers!
Knowledge Base – Benefits
31
• Coverage – not all datasets entered yet
• Lots of data – we store source, interim, target datasets
• Concept can be a bit intangible at times
Knowledge Base – Challenges
32
• Data Distribution is a natural "flow" from one processing node to another
• Ad-hoc relationship traversal is difficult in relational databases
• Flexibility is essential
– New sources, datasets, consumers, rules, …
• Everything is an instance
– Model is very organic, focusing on relationships between processing nodes rather than structure
How did a graph database help?
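The point about ad-hoc traversal can be illustrated with a minimal sketch of a variable-depth walk over a flow graph. The `feeds` mapping and all node names are hypothetical. The question asked is "what is downstream of this node?", without knowing in advance how many hops the flow has.

```python
from collections import deque

# Hypothetical flow: each processing node lists the nodes it feeds.
feeds = {
    "equity_master": ["equity"],
    "rating_master": ["equityrating"],
    "equity": ["instrument_agg"],
    "equityrating": ["instrument_agg"],
    "instrument_agg": ["risk_view", "settlement_view"],
    "risk_view": ["risk_system"],
    "settlement_view": ["settlement_system"],
}


def downstream(start):
    """Breadth-first traversal returning every node reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in feeds.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


affected = downstream("rating_master")
```

In a relational model the same question typically needs a recursive query or one self-join per hop, whereas a graph query language can express it as a variable-length pattern.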
33
• Answers our questions … and more
• Flexible schema → Can model different flows
• Easy(-ish) Query Language → Cypher
• Easy to create platform service layer
• Good performance
• Good support from vendor
Neo4j – Benefits
34
• Loading data required manual work
• No out-of-the-box tools to manage the data
• Skills rare … but easy to grow
Neo4j – Challenges
35
• Focus on human interactions
– Better search
– Better visualisation
• Widen coverage of datasets
• Offer to other parts of Bank
• Impact Analysis tools
• Self-service data integration
Next steps
36
Thank you!

