Journey to Self-Discovery: Synthesizing 30
Years of Data Pipelines with Knowledge Graphs
Mayank Gupta
Senior VP of Technology, LPL Financial
My History with Knowledge Graphs
• Graph-adjacent work for data distribution at Morgan Stanley
• UBS:
  • Data virtualization layer – using Neo4j
  • Role-based access control – using Neo4j
• LPL:
  • Data management: Account -> Client -> Household – using Neo4j
  • A knowledge graph of financial concepts and help content, built to improve the efficacy of home office and advisor search results – using GraphAware's Hume and Neo4j
  • Using graphs to describe complex business organizations and relationships – driving improved engagement with our clients
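The Account -> Client -> Household hierarchy mentioned above can be sketched as a small property-graph model. The labels and relationship types below (BELONGS_TO, MEMBER_OF) are illustrative assumptions, not LPL's actual schema; the function simply builds a Cypher MERGE statement as a string.

```python
# Illustrative sketch of an Account -> Client -> Household hierarchy
# as it might be modeled in Neo4j. Labels and relationship types
# are hypothetical, not LPL's production schema.

def account_hierarchy_cypher(account_id: str, client_id: str, household_id: str) -> str:
    """Build one Cypher statement linking an account to its client and household."""
    return (
        f"MERGE (a:Account {{id: '{account_id}'}}) "
        f"MERGE (c:Client {{id: '{client_id}'}}) "
        f"MERGE (h:Household {{id: '{household_id}'}}) "
        "MERGE (a)-[:BELONGS_TO]->(c) "
        "MERGE (c)-[:MEMBER_OF]->(h)"
    )

stmt = account_hierarchy_cypher("ACC-001", "CLI-42", "HH-7")
print(stmt)
```

A statement like this would typically be executed through the Neo4j driver; MERGE keeps the load idempotent, so re-running the same feed does not duplicate nodes.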
Problem Statement
We are experiencing a sustained increase in transaction volumes, and our business - practices, advisors, accounts, assets under management - is growing rapidly.
This is driving a need to increase the throughput, resiliency and scale of our data pipelines.
We also want to improve the value, quality and experiences that data enables for our user cohorts, while operating more efficiently.
High Level Anatomy of Data Pipelines
Sources of Signal -> Integration Pipes -> Raw Zone -> Map to Internal Logical Model -> Mastering into Systems of Record -> Readying for Distribution -> Consumers
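One way to make these stages machine-readable is to enumerate them and the hand-offs between them. The stage names below come straight from the slide; the list-of-pairs representation is an assumption about how one might seed them into a graph.

```python
# The seven pipeline stages from the slide, in order. Modeling each
# hand-off as an (upstream, downstream) pair is an illustrative choice.
STAGES = [
    "Sources of Signal",
    "Integration Pipes",
    "Raw Zone",
    "Map to Internal Logical Model",
    "Mastering into Systems of Record",
    "Readying for Distribution",
    "Consumers",
]

# Pair each stage with its downstream neighbour: 7 stages, 6 hand-offs.
HANDOFFS = list(zip(STAGES, STAGES[1:]))

for upstream, downstream in HANDOFFS:
    print(f"{upstream} -> {downstream}")
```

Each pair maps naturally onto a relationship (for example a hypothetical `FEEDS` edge) between stage nodes in the knowledge graph.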
Data Pipelines – Knowledge Graph
Knowledge types, and the source or approach used to get each:
• Signal Sources – Contracts, Integration Objects, Configurations, Job Definitions
• Physical Data Description – File Layouts, Physical DB Schema, Message Models/Schema, Raw Zone Scans
• Physical Plant Description – ITIL CMDBs, Asset Inventories
• Logical Models / Concepts – Enterprise Vocabularies, Metadata Repositories, Public Concept Sources
• Processing Details – Code Scanners, ELT/ETL Configurations, Rules Bases, Manual
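A first pass at loading this catalogue into the graph could be as simple as a mapping from knowledge type to its acquisition sources. The dictionary mirrors the slide; the `ACQUIRED_FROM` relationship and the Cypher-generating loader are hypothetical sketches, not LPL's implementation.

```python
# Knowledge types from the slide, each paired with the sources or
# approaches used to populate it.
KNOWLEDGE_SOURCES = {
    "Signal Sources": ["Contracts", "Integration Objects", "Configurations", "Job Definitions"],
    "Physical Data Description": ["File Layouts", "Physical DB Schema", "Message Models/Schema", "Raw Zone Scans"],
    "Physical Plant Description": ["ITIL CMDBs", "Asset Inventories"],
    "Logical Models / Concepts": ["Enterprise Vocabularies", "Metadata Repositories", "Public Concept Sources"],
    "Processing Details": ["Code Scanners", "ELT/ETL Configurations", "Rules Bases", "Manual"],
}

def to_cypher(catalogue: dict) -> list:
    """Emit one idempotent MERGE statement per (knowledge type, source) pair.
    The ACQUIRED_FROM relationship name is a hypothetical modeling choice."""
    stmts = []
    for ktype, sources in catalogue.items():
        for src in sources:
            stmts.append(
                f"MERGE (k:KnowledgeType {{name: '{ktype}'}}) "
                f"MERGE (s:Source {{name: '{src}'}}) "
                "MERGE (k)-[:ACQUIRED_FROM]->(s)"
            )
    return stmts

statements = to_cypher(KNOWLEDGE_SOURCES)
print(len(statements))  # one statement per (knowledge type, source) pair
```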
Benefits for the Data Function
• Enables decision making for our journey to modernize
• Allows us to discover duplications and inconsistencies, and to optimize
• Allows us to better engage data providers and users – driving to well-aligned outcomes – and making them a part of the data pipelines
• A boon to better operations, enabling problem avoidance and faster resolution
• Enables faster time to market and fosters the spirit of continuous improvement
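The duplication-discovery benefit above lends itself to a concrete check: once pipelines are catalogued, two feeds moving data between the same endpoints are candidates for consolidation. The inventory below is fabricated for illustration; in practice this grouping would be a graph query over the knowledge graph rather than an in-memory scan.

```python
from collections import defaultdict

# Hypothetical pipeline inventory: (pipeline_name, source, target).
# Two pipelines moving data between the same endpoints are
# duplication candidates worth reviewing.
feeds = [
    ("acct_feed_v1", "TradingSystem", "RawZone"),
    ("acct_feed_v2", "TradingSystem", "RawZone"),
    ("client_feed", "CRM", "RawZone"),
]

# Group pipelines by their (source, target) edge.
groups = defaultdict(list)
for name, src, dst in feeds:
    groups[(src, dst)].append(name)

# Any edge served by more than one pipeline is a duplication candidate.
duplicates = {edge: names for edge, names in groups.items() if len(names) > 1}
print(duplicates)
```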
Hypothesized Benefits for the Enterprise
Beyond the basics – better data, better decisions, better business outcomes:
• This knowledge graph – if populated end to end – moves us closer to providing intent-based engagement with data
• The data consumer can move to a more declarative style of engaging with data, versus the imperative approaches that are available today
• A comprehensive view of the information landscape enables better risk preparation and investment planning
• Data moves to information, and ushers in a knowledge-driven enterprise
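The declarative-versus-imperative contrast can be made concrete with a toy example: imperatively, the consumer hand-codes how to roll accounts up to households; declaratively, they state only what they want and a query layer does the traversal. All data, thresholds, and the Cypher query shown are fabricated for illustration.

```python
# Imperative style: the consumer encodes HOW to get the answer,
# walking the account -> household roll-up by hand.
# (All records and the 200,000 threshold are fabricated examples.)
accounts = [
    {"id": "A1", "household": "H1", "aum": 250_000},
    {"id": "A2", "household": "H1", "aum": 150_000},
    {"id": "A3", "household": "H2", "aum": 90_000},
]

totals = {}
for acct in accounts:
    totals[acct["household"]] = totals.get(acct["household"], 0) + acct["aum"]
imperative = [h for h, aum in totals.items() if aum > 200_000]

# Declarative style: the consumer states WHAT they want; a query
# layer over the knowledge graph (hypothetical Cypher) does the rest.
declarative_query = """
MATCH (a:Account)-[:MEMBER_OF]->(h:Household)
WITH h, sum(a.aum) AS aum
WHERE aum > 200000
RETURN h.id
"""

print(imperative)  # ['H1']
```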