data.world
How to launch a data catalog in minutes
Tim Gasper
VP of Product
data.world
Five things to consider about Data Mesh and Data Governance
Paul Gancz
Partner Solutions Architect
Snowflake
Juan Sequeda
Principal Scientist
data.world
datadotworld data.world
Better together
The most powerful combined data mesh solution to eliminate data silos and democratize access to well-governed data products.
Snowflake: The Data Cloud
ONE platform, MANY workloads, NO data silos
+
data.world: The Modern Data Catalog
Make data discovery, governance, and analysis easy.
Why Data Mesh?
What is the problem?
• Monolithic approaches to data don’t scale socially
• Data is treated as an afterthought
Why do we care?
• Centralized processes and teams become a bottleneck for the business
• Data value is being left untapped
Data Mesh Principles
Distribute responsibility for data pipelines and data quality to people with domain knowledge. Serve data as a product using a common self-service IT infrastructure platform.
Domain-Centric Ownership & Architecture
• Data pipelines owned by teams with domain knowledge
• Domains own cleansing, refinement, historization, pre-aggregation, etc.
• Domains responsible for governance, lineage, etc.
Data as a Product
• Domains treat data with consumers in mind
• Data is discoverable
• Data is easy to obtain and use
• Data is documented
• Domains responsible for the quality of their data
Self-Service Data Platform
• Common set of tools across domains
• Domain-agnostic
• Easy to use and low maintenance to support
• Easy to deploy repeatable patterns for data cleansing, transformation, automation, storage, security, governance, sharing
Federated Governance
• Global interoperability standards across domains
• Define and use global data governance policies
• Define and apply governance within each domain and propagate downstream
Source: Zhamak Dehghani, https://martinfowler.com/articles/data-monolith-to-mesh.html , https://martinfowler.com/articles/data-mesh-principles.html
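One way to picture the federated governance principle (global policies, plus policies defined in each domain, propagated downstream) is the minimal Python sketch below. The policy names, the `DataProduct` class, and the `effective_policies` method are all hypothetical and chosen only for illustration; the slides do not prescribe an implementation.

```python
from dataclasses import dataclass, field

# Hypothetical policies that every domain agrees to (the "global" layer).
GLOBAL_POLICIES = frozenset({"pii-must-be-masked", "owner-must-be-named"})


@dataclass
class DataProduct:
    name: str
    domain_policies: frozenset = frozenset()      # rules added by the owning domain
    upstream: list = field(default_factory=list)  # data products this one is built from

    def effective_policies(self) -> frozenset:
        """Global policies, plus the domain's own, plus everything inherited from upstream."""
        policies = GLOBAL_POLICIES | self.domain_policies
        for parent in self.upstream:
            policies |= parent.effective_policies()
        return policies


# Illustrative data only: a source-aligned product and an aggregate built on it.
customer = DataProduct("customer", frozenset({"eu-residency"}))
customer_360 = DataProduct("customer_360", frozenset({"retain-90-days"}), [customer])

print(sorted(customer_360.effective_policies()))
```

In this sketch the aggregate product inherits the upstream domain's residency rule without redefining it, which is the "propagate downstream" behaviour the principle describes.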
DATA GOVERNANCE CHALLENGES
Data Is Everywhere
Must be able to eliminate silos inside and outside your organization
Managing Data Is Unnecessarily Complex
Knowing what your data is, and how it is being used, is hard
Security and Governance Are Inherently Rigid
Requires managing risk and changing regulations, while getting the most from your data
DATA GOVERNANCE IN THE DATA CLOUD
Know Your Data
Understand, classify, and track data and its usage
Protect Your Data
Secure sensitive data with policy-based access controls
Unlock Your Data
Securely collaborate and share data across teams
DATA GOVERNANCE
What it has been...
• Risk avoidance and compliance
• Top-down policies
• Cumbersome processes
What it needs to be...
• Rules of cooperation and collaboration
• Process of data & analytics together
• Capture knowledge in real time
Data Governance and Data Catalogs
What is the goal of data governance?
• Governance is now about data discoverability, not just data protection.
• While application silos pose a governance challenge, inclusive, agile data governance approaches pose solutions.
• Governance needs to be a benefit, not a burden. The friction has to go away.
• Business users don’t want to install software for governance; SaaS removes that friction and is the way to go.
What do catalogs do and how do they help?
• Understand and trust your data with profiling, sampling, and lineage.
• Everyone (producers and consumers) actively contributes to data as they use it.
• Accelerate time to value and uncover insights.
• Cloud-native and multi-tenant approaches are highly available, scale bigger, perform better, and evolve faster.
Five things to consider about Data Mesh and Data Governance
1. What is the scope?
2. Who are the stakeholders?
3. Where should we standardize and productize data?
4. Who is responsible?
5. How to be agile?
1. What is the scope?
[Figure: Example Architecture. Data sources connect to consumers through ELT and ETL pipelines and data models.]
What are your domains?
Source: Zhamak Dehghani - martinfowler.com
Data Mesh: Domain-centric Architecture
[Figure: the example architecture grouped into domains (Customer, Helpdesk & Support, Products, Orders & Sales, Marketing & Promotions, Customer 360), with data sources from different domains, consumers, and a shared layer of Interoperability Standards, Federated Governance, and Data Catalog.]
• Domain-centric ownership of data sources, pipelines, and data quality
• Ownership sits with domain knowledge → better data quality for consumers
• Domain teams can react faster to source format changes or quality issues
• Overall easier to scale the number of sources & consumers
• Data assets offered as products
• “Serve & pull” instead of “push & ingest” model
How scope affects your data catalog
Each approach below extends the one before it; a leading + marks what it adds.
Analytics Catalog
  Purpose: Enabling data consumers to discover assets
  Coverage: Data lake and data mart tables and related reports
  Stakeholders: Analysts, BI team, report writers, report users
Data Platform Catalog
  Purpose: + Enabling the management of the data platform (automation and observability)
  Coverage: + Upstream data sources, lineage, streaming data, ML models, usage information
  Stakeholders: + Data scientists, data engineers
Enterprise Resource Graph
  Purpose: + Managing and protecting the company’s data-related resources
  Coverage: + All data systems, services, classification, access, and provenance
  Stakeholders: + Run Time Developers, Security, Privacy
The approach to managing metadata will depend on the problems that are a priority to solve.
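The three catalog scopes build on each other, which can be made concrete with a short Python sketch. It assumes the "+" entries are cumulative, and the names `CATALOG_SCOPES`, `coverage`, and `narrowest_approach` are hypothetical helpers introduced here, not part of any catalog product.

```python
# Each approach lists only what it adds; the order runs from narrowest to widest scope.
CATALOG_SCOPES = [
    ("Analytics Catalog",
     ["data lake and data mart tables", "related reports"]),
    ("Data Platform Catalog",
     ["upstream data sources", "lineage", "streaming data", "ML models", "usage information"]),
    ("Enterprise Resource Graph",
     ["all data systems", "services", "classification", "access", "provenance"]),
]


def coverage(approach: str) -> list[str]:
    """Everything an approach covers: its own additions plus all narrower scopes."""
    covered: list[str] = []
    for name, added in CATALOG_SCOPES:
        covered.extend(added)
        if name == approach:
            return covered
    raise KeyError(approach)


def narrowest_approach(required: set[str]) -> str:
    """Smallest scope whose cumulative coverage includes every required item."""
    for name, _ in CATALOG_SCOPES:
        if required <= set(coverage(name)):
            return name
    raise ValueError(f"no approach covers {sorted(required)}")


print(narrowest_approach({"related reports", "lineage"}))
```

Starting from the problems that are a priority, as the slide suggests, then amounts to picking the narrowest scope that covers them.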
2. Who are the stakeholders?
Stakeholders
• Privacy: Capture and store what user data exists, where it is, and who is responsible for it.
• Security: Tell me where the sensitive data is, how it is handled, who has access, and who is responsible for it.
• Data Governance: Provide a platform to store and share data best practices, certifications, documentation, and curated data models.
• Data Consumers: Tell me what data there is, its usability, how to use it, and who to go to for help.
• Data Producers: Tell me who uses my data, and give me a platform to interact with them.
• Platforms: Enable automation within data systems: registration, provisioning, validation, access controls, etc.
• Data Leadership: Key to buy-in, executive sponsorship, and oversight.
3. Where should we standardize and productize data?
What is a Data Product?
“A product that facilitates an end goal through the use of data”
DJ Patil, former United States Chief Data Scientist
“Data as a product defines a new concept, called data product that embodies standardized characteristics to make data valuable and usable.”
Zhamak Dehghani, Thoughtworks Director of Emerging Technologies and founder of data mesh
Data Product ABCs
A: Accountability
● Who is the owner?
● Who defines the requirements?
● Who fixes it when it breaks?
B: Boundaries
● What is it? What isn’t it?
● Where will it live?
● Inputs and Outputs
C: Contracts & Expectations
● Data Constraints, Definitions, Tests
● SLAs, SLOs, Sharing Agreements, Consents, Purposes
● Performance, Scale, Maintainability, etc.
D: Downstream Consumers
● Current and Potential Consumers
● Use Cases
● Roadmap
E: Explicit Knowledge
● Modeling Schemas
● Documentation
● Relationships with other Data Products
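The ABCs can be read as a checklist that every data product fills in. The Python sketch below is one possible way to record it; the `DataProductSpec` class, its field names, and the sample values are assumptions made for illustration and cover only a few of the questions on the slide.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DataProductSpec:
    # A: Accountability
    owner: str = ""
    # B: Boundaries
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    # C: Contracts & Expectations
    slas: list = field(default_factory=list)
    # D: Downstream Consumers
    consumers: list = field(default_factory=list)
    # E: Explicit Knowledge
    documentation: str = ""


def missing_sections(spec: DataProductSpec) -> list[str]:
    """Names of the fields that are still empty, in declaration order."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name)]


# Hypothetical product that has an owner and an output but nothing else yet.
spec = DataProductSpec(owner="sales-team", outputs=["orders_daily"])
print(missing_sections(spec))
```

A check like `missing_sections` could run when a product is registered in the catalog, so that gaps in accountability or contracts are visible before consumers depend on it.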
What is a Data Product?
[Figure: a single producer. Labels: Data Producer A, Internal Data, API, Dataset, Data Product(s), Data Platform, Data Consumer A, Data Consumer B.]
The Cloud-Native Data Catalog
What is a Data Product?
[Figure: several producers. Labels: Data Producers A, B, and C, each with Internal Data and an API; Data Product(s); Aggregate or “Enterprise” Data Product(s); Data Platform; Data Consumers A, B, and C.]
Data Mesh Reference Architecture
Snowflake Data Sharing as the preferred interoperability standard. Data Marketplace makes data discoverable.
[Figure: Snowflake Data Cloud with Data Sources, the domains Customer, Sales, Products, Marketing, and Customer 360, an inventory of shared data products, a Snowflake Reader Account, and Consumers (3rd party marketing agency, Reseller, Sales Analysts, Churn & Retention, Business optimization, Finance & Controlling). Interoperability Standards, Federated Governance, and 3rd Party Tools span the domains.]
Data Exchange / Catalog for Consumers:
• Connects providers to consumers
• Inventory of available assets
• No central storage of shared data
• Providers retain full control over shared assets (data, functions)
• Consumers access live provider data, no copies or ETL required. Register shared data for local SQL access in their environment (no copy)
Data domains:
• Can consume and share data or functions
• Control access policies, data masking, etc. for downstream consumers
• Can share external tables, i.e. provide access to data outside of Snowflake
• Can provide reader accounts for non-Snowflake consumers
Data Catalog for Producers:
• Technical Metadata Inventory, Lineage, Sensitive Data, Business Glossary
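The sharing model on this slide (an inventory that connects providers to consumers, with no central copy and with providers keeping control) can be illustrated with a small conceptual model in Python. This is not Snowflake code and does not use any Snowflake API; the `DataExchange` class and its methods are hypothetical names invented for the sketch.

```python
class DataExchange:
    """Inventory of shared assets. It holds references to provider data, never the data itself."""

    def __init__(self):
        # listing name -> (reference to the provider's rows, set of allowed consumers)
        self._listings = {}

    def publish(self, name, rows, allowed):
        self._listings[name] = (rows, set(allowed))

    def revoke(self, name, consumer):
        # The provider stays in control: access can be withdrawn at any time.
        self._listings[name][1].discard(consumer)

    def read(self, name, consumer):
        rows, allowed = self._listings[name]
        if consumer not in allowed:
            raise PermissionError(f"{consumer} has no access to {name}")
        # Built from the provider's current rows on every call; nothing is cached here.
        return tuple(rows)


# Illustrative data: the Sales domain shares its orders with analysts.
orders = [{"id": 1, "amount": 40}]
exchange = DataExchange()
exchange.publish("sales.orders", orders, {"sales_analysts"})

first_read = exchange.read("sales.orders", "sales_analysts")
orders.append({"id": 2, "amount": 15})   # the provider updates its own data
second_read = exchange.read("sales.orders", "sales_analysts")
print(len(first_read), len(second_read))
```

Because the exchange only keeps a reference, the second read reflects the provider's update without any ETL step, which is the "serve & pull" behaviour the architecture relies on.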
Global and Multi-Cloud Data Mesh
Snowflake enables a truly global and multi-cloud data mesh across cloud platforms and regions.
• Data sources, data domains, and consumers can sit in different regions and different cloud platforms
[Figure: Snowflake Data Cloud with Data Domains 1 to 5 spread across regions (US East, FRA, Tokyo, Zurich), Data Sources, an inventory of shared data products, Consumers, and a Snowflake Reader Account. Interoperability Standards, Federated Governance, and 3rd Party Tools span the domains.]
GOVERNANCE IN THE DATA CLOUD
Know, protect, and unlock your data
Know your data: Object Tagging, Auto Classification, Object Dependencies**, Access History (writes)**, Access History (data access audit)
Protect your data: Row Access Policies, Dynamic Data Masking, External Tokenization, Conditional Masking
Unlock your data: Secure Data Sharing, Data Exchange, Data Marketplace
Also labeled on the slide: Object Dependencies (impact analysis), Access History (data lineage); What, Where, Who
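Two of the "protect" features, dynamic data masking and row access policies, are easier to follow with a toy example. In Snowflake these are policy objects defined in SQL; the Python below is only a conceptual model of the behaviour, and the role names, the `ROLE_REGION` mapping, and the functions are assumptions invented for this sketch.

```python
# Hypothetical mapping from a role to the single region it may see.
ROLE_REGION = {"EU_ANALYST": "EU", "US_ANALYST": "US"}
UNRESTRICTED_ROLE = "PRIVACY_OFFICER"  # hypothetical role that sees everything


def mask_email(value: str, role: str) -> str:
    """Masking: the stored value is unchanged; what a query returns depends on the role."""
    if role == UNRESTRICTED_ROLE:
        return value
    _, _, domain = value.partition("@")
    return "***@" + domain


def row_visible(row: dict, role: str) -> bool:
    """Row access: regional roles only see rows from their own region."""
    return role == UNRESTRICTED_ROLE or row["region"] == ROLE_REGION.get(role)


def query(rows: list, role: str) -> list:
    """Apply the row filter first, then mask the sensitive column in what remains."""
    return [
        {**row, "email": mask_email(row["email"], role)}
        for row in rows
        if row_visible(row, role)
    ]


# Illustrative data only.
customers = [
    {"email": "ana@example.com", "region": "EU"},
    {"email": "bo@example.com", "region": "US"},
]
print(query(customers, "EU_ANALYST"))
```

The point of the sketch is that both policies are evaluated at query time against the caller's role, so the same table can be shared with different consumers without maintaining separate masked copies.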
4. Who is responsible?
Who is responsible?
Whether you call them data product managers, data stewards, data owners, data advocates, data custodians, or data trustees…
Let’s revisit Accountability of the Data Product ABCs Framework:
● Who is the owner?
● Who defines the requirements?
● Who fixes it when it breaks?
● Who defines the roadmap?
● Who has the expertise?
What is the smallest number of critical “hats to wear”?
Changing the Paradigm
[Figure: two diagrams, “Data Management as an Intermediary” and “Direct Data Producer and Data Consumer Collaboration”. Labels: Data Producer, Data Consumer, Data Platform, Data Engineering, Data Management.]
Data Mesh: Domain-centric Responsibility
[Figure: the domain-centric architecture diagram (domains Customer, Helpdesk & Support, Products, Orders & Sales, Marketing & Promotions, Customer 360; Interoperability Standards, Federated Governance, Data Catalog), annotated with the layers Data Sources, Data Integration, Data Management, and Data Consumption.]
5. How to be agile?
What is Agile Data Governance?
The process of creating and improving data assets by iteratively capturing knowledge as data producers and consumers work together so that everyone can benefit.
Empowering the usage of data safely.
It adapts the proven best practices of Agile and Open software development to data and analytics.
Agile Data Governance Process: iterate!
The time impact of being fast, incremental, and iterative
[Figure: timeline comparison. One track has the steps Define standards and principles, Define policies, Build workflows, Release, Refine. The other repeats, for each of Use Cases 1 to 4, the steps Define standards/principles, Define policies, Build workflows, Release, with Analysis, Insight, Value and Measure, Learn, Iterate after each.]
Takeaways
What is the scope?
● Identify the Domains. You are already doing the work, they exist!
● Depends on the problems that are a priority to solve: Analytics, Data Platform, Enterprise Resources
Who are the stakeholders of your data catalog?
● Always need Data Leadership
● Consumers, Producers, Governance, Privacy, Security, Platforms
Where to Standardize/Productize Data?
● Data Product ABCs: Accountability, Boundaries, Contracts & Expectations, Downstream Consumers, Explicit Knowledge
● Consumption, Data Mgmt, Data Producing Systems
Who is responsible?
● Accountability: Owner, Requirements, Who Fixes, Roadmap, Expertise
● Consumption, Data Mgmt, Data Producing Systems
How to be agile?
● Empowering the usage of data safely.
● Develop a backlog of questions based on end user business value
● Sprints, Peer Review, Collaborate, Iterate
Learn more about data mesh governance
What’s inside? How to…
● Establish a framework for treating data as a product
● Find the right balance of decentralization and centralization
● Transform data into knowledge
Download it here:
data.world/resources/reports-and-tools/data-mesh-governance-white-paper

Five Things to Consider About Data Mesh and Data Governance

  • 1. data.world How to launch a data catalog in minutes Tim Gasper VP of Product data.world Five things to consider about Data Mesh and Data Governance Paul Gancz Partner Solutions Architect Snowflake Juan Sequeda Principal Scientist data.world
  • 2. datadotworld data.world Better together The Data Cloud ONE platform MANY workloads NO data silos The most powerful combined data mesh solution to eliminate data silos and democratize access to well-governed data products. The Modern Data Catalog Make data discovery, governance, and analysis easy. +
  • 3. Why Data Mesh? What is the problem? Monolithic approaches to data don’t scale socially Data is treated as an afterthought Why do we care? Centralized processes and teams become a bottleneck for the business Data value is being left untapped
  • 4. Distribute responsibility for data pipelines and data quality to people with domain knowledge. Serve data as a product using a common self-service IT infrastructure platform. Domain-Centric Ownership & Architecture Data as-a-Product Self-Service Data Platform Federated Governance Data pipelines owned by teams with domain knowledge Domains own cleansing, refinement, historization, pre-aggregation, etc. Domains responsible for governance, lineage, etc. Domains treat data with consumers in mind Data is discoverable Data is easy to obtain and use Data is documented Domains responsible for the quality of their data Common set of tools across domains Domain-agnostic Easy to use and low maintenance to support Easy to deploy repeatable patterns for data cleansing, transformation, automation, storage, security, governance, sharing Global interoperability standards across domains Define and use global data governance policies Define and apply governance within each domain and propagate downstream Data Mesh Principles Source: Zhamak Dehghani, https://martinfowler.com/articles/data-monolith-to-mesh.html , https://martinfowler.com/articles/data-mesh-principles.html
  • 5. DATA GOVERNANCE CHALLENGES 5 Data Is Everywhere Must be able to eliminate silos inside and outside your organization Managing Data Is Unnecessarily Complex Knowing what your data is — and how it is being used — is hard Security and Governance Are Inherently Rigid Requires managing risk and changing regulations, while getting the most from your data
  • 6. DATA GOVERNANCE IN THE DATA CLOUD 6 Know Your Data Protect Your Data Understand, classify, and track data and its usage Secure sensitive data with policy-based access controls Securely collaborate and share data across teams Unlock Your Data
  • 7. What is has been... Risk avoidance and compliance Top-down policies Cumbersome processes DATA GOVERNANCE
  • 8. What it needs to be... DATA GOVERNANCE Rules of cooperation and collaboration Process of data & analytics together Capture knowledge in real-time
  • 9. What is the goal of data governance Data Governance and Data Catalogs What do catalogs do and how they help Governance is now about data discoverability; not just data protection. While application silos pose a governance challenge, inclusive, agile data governance approaches pose solutions. Governance needs to be a benefit, not a burden. The friction has to go away. Business users don’t want to install software for governance, SaaS removes all the friction and is the way to go. Understand and trust your data with profiling, sampling and lineage. Everyone (producers and consumers) actively contributes to data as they use it. Accelerates time to value and uncover insights. Cloud-native and multi-tenant approach are highly available, scale bigger, perform better and evolve faster.
  • 10. 1 2 3 4 5 Five things to consider about Data Mesh and Data Governance What is the scope? Who are the stakeholders? Where should we standardize and productize data? Who is responsible? How to be agile?
  • 11. 1. What is the scope?
  • 12. Example Architecture Data sources Consumers ELT ELT ETL ETL ETL Data Model Data Model ETL ETL ETL ETL ETL ETL ETL
  • 13. datadotworld data.world Source: Zhamak Dehghani - martinfowler.com What are your domains?
  • 14. Data Mesh: Domain-centric Architecture Domain: Customer Data sources from different domains Consumers Domain: Helpdesk & Support Domain: Products Interoperability Standards, Federated Governance, Data Catalog ELT ELT ETL ETL ETL Data Model Data Model ETL ETL ETL ETL ETL ETL ETL Domain: Orders & Sales Domain: Marketing & Promotions Domain: Customer 360 • Domain-centric ownership of data sources, pipelines, and data quality • Ownership sits with domain knowledge 🡪 better data quality for consumers • Domain teams can react faster to source format changes or quality issues • Overall easier to scale the number of sources & consumers • Data assets offered as products • “Serve & pull” instead of “push & ingest” model
  • 15. datadotworld data.world Resource Graph Data Platform Catalog How scope affects your data catalog Analytics Catalog Approach Purpose Coverage Stakeholders Analytics Catalog Enabling Data Consumers discover assets Data Lake and Data Mart Tables and related Reports Analysts, BI Team, Report Writers, Report Users Data Platform Catalog + Enabling the management of Data Platform (automation and observability) + Upstream Data sources, lineage, streaming data, ml model, usage information + Data Scientist, Data Engineers Enterprise Resource Graph + Managing and protect the company’s data related resources + All data systems, services, classification, access and provenance. + Run Time Developers, Security, Privacy The approach to managing metadata will depend on the problems that are a priority to solve.
  • 16. 2. Who are the stakeholders?
  • 17. datadotworld data.world Capture and store what user data exists, where is it, and who is responsible for it? Privacy Tell me where is the sensitive data, how is it handled, who has access, who is responsible for it? Provide a platform to store and share data best practices, certifications, documentation, and curated data models. Tell me what data there is, its usability, how to use it and who to go to for help. Tell me who uses my data, and give me a platform to interact with them. Enable automation within data systems – registration, provisioning, validation, access controls, etc. Stakeholders Key to buy-in, executive sponsorship, and oversight. Security Platforms Data Governance Data Producers Data Consumers Data Leadership
  • 18. 3. Where should we standardize and productize data?
  • 19. What is a Data Product?
      • “A product that facilitates an end goal through the use of data” (DJ Patil, former United States Chief Data Scientist)
      • “Data as a product defines a new concept, called data product that embodies standardized characteristics to make data valuable and usable.” (Zhamak Dehghani, Thoughtworks Director of Emerging Technologies and founder of data mesh)
  • 20. Data Product ABCs
      A: Accountability
          ● Who is the owner?
          ● Who defines the requirements?
          ● Who fixes it when it breaks?
      B: Boundaries
          ● What is it? What isn’t it?
          ● Where will it live?
          ● Inputs and Outputs
      C: Contracts & Expectations
          ● Data Constraints, Definitions, Tests
          ● SLAs, SLOs, Sharing Agreements, Consents, Purposes
          ● Performance, Scale, Maintainability, etc.
      D: Downstream Consumers
          ● Current and Potential Consumers
          ● Use Cases
          ● Roadmap
      E: Explicit Knowledge
          ● Modeling Schemas
          ● Documentation
          ● Relationships with other Data Products
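      The ABCs can be read as a checklist that each data product carries with it. The following is a minimal sketch, not part of the original deck: the `DataProduct` class, its field names, and the sample values are all hypothetical, chosen only to mirror the five categories on the slide. It reports which categories still have unanswered questions.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor with one group of fields per ABC category."""
    name: str
    # A: Accountability
    owner: str = ""
    requirements_owner: str = ""
    fixer: str = ""
    # B: Boundaries
    description: str = ""
    location: str = ""
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    # C: Contracts & Expectations
    tests: list[str] = field(default_factory=list)
    slas: list[str] = field(default_factory=list)
    # D: Downstream Consumers
    consumers: list[str] = field(default_factory=list)
    use_cases: list[str] = field(default_factory=list)
    # E: Explicit Knowledge
    schema_ref: str = ""
    documentation: str = ""

    def incomplete_abcs(self) -> list[str]:
        """Return the categories in which at least one field is still empty."""
        checks = {
            "Accountability": [self.owner, self.requirements_owner, self.fixer],
            "Boundaries": [self.description, self.location, self.inputs, self.outputs],
            "Contracts & Expectations": [self.tests, self.slas],
            "Downstream Consumers": [self.consumers, self.use_cases],
            "Explicit Knowledge": [self.schema_ref, self.documentation],
        }
        return [category for category, values in checks.items() if not all(values)]


# Illustrative values only: A and B are filled in, C to E are not yet.
orders = DataProduct(
    name="orders_daily",
    owner="sales-domain-team",
    requirements_owner="sales-analytics-lead",
    fixer="sales-data-engineering",
    description="Daily order totals per region",
    location="sales_db.products.orders_daily",
    inputs=["orders_raw"],
    outputs=["orders_daily"],
)
print(orders.incomplete_abcs())
```

      A catalog could run a check like this on registration, so that a product without an owner or a contract is flagged before consumers depend on it.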
  • 21. What is a Data Product?
      [Diagram labels: Data Producer A, Internal Data API, Dataset, Data Product(s), Data Platform, Data Consumer A, Data Consumer B]
  • 22. What is a Data Product?
      [Diagram labels: Data Producers A, B, and C, each with an Internal Data API; Data Product(s); Aggregate or “Enterprise” Data Product(s); Data Platform; Data Consumers A, B, and C]
  • 23. Data Mesh Reference Architecture
      [Diagram: Data Sources; domains Customer, Sales, Products, Marketing, and Customer 360 in the Snowflake Data Cloud; an inventory of shared data products; a Snowflake Reader Account; and Consumers (3rd party marketing agency, Reseller, Sales Analysts, Churn & Retention, Business optimization, Finance & Controlling). Shared layer: Interoperability Standards, Federated Governance, 3rd Party Tools.]
      Snowflake Data Sharing as the preferred interoperability standard. Data Marketplace makes data discoverable.
      Data Exchange / Catalog for Consumers:
      • Connects providers to consumers
      • Inventory of available assets
      • No central storage of shared data
      • Providers retain full control over shared assets (data, functions)
      • Consumers access live provider data, no copies or ETL required. Register shared data for local SQL access in their environment (no copy)
      Data domains:
      • Can consume and share data or functions
      • Control access policies, data masking, etc. for downstream consumers
      • Can share external tables, i.e. provide access to data outside of Snowflake
      • Can provide reader accounts for non-Snowflake consumers
      Data Catalog for Producers:
      • Technical Metadata Inventory, Lineage, Sensitive Data, Business Glossary
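      The provider and consumer steps described on this slide can be sketched as SQL generated from Python. This is an illustration under assumptions, not material from the deck: the helper functions and every identifier (`sales_share`, `sales_db`, `ab12345`, and so on) are hypothetical, and the statements follow Snowflake's secure data sharing commands as commonly documented (`CREATE SHARE`, `GRANT ... TO SHARE`, `ALTER SHARE ... ADD ACCOUNTS`, `CREATE DATABASE ... FROM SHARE`). Check them against the current Snowflake documentation before use.

```python
def provider_share_statements(share, database, schema, table, consumer_accounts):
    """Build provider-side SQL that shares one table without copying data.

    Identifiers are interpolated as-is; a real tool should validate or quote them.
    """
    return [
        f"CREATE SHARE {share};",
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share};",
        f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO SHARE {share};",
        f"ALTER SHARE {share} ADD ACCOUNTS = {', '.join(consumer_accounts)};",
    ]


def consumer_mount_statement(local_database, provider_account, share):
    """Build consumer-side SQL that registers the share for local SQL access."""
    return f"CREATE DATABASE {local_database} FROM SHARE {provider_account}.{share};"


# Hypothetical names: the Sales domain shares a table with one consumer account.
statements = provider_share_statements(
    share="sales_share",
    database="sales_db",
    schema="products",
    table="orders_daily",
    consumer_accounts=["ab12345"],
)
mount = consumer_mount_statement("sales_from_mesh", "xy67890", "sales_share")
for sql in statements + [mount]:
    print(sql)
```

      The provider keeps control of the grants, and the consumer only mounts a read-only database, which matches the “no copies or ETL required” point above.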
  • 24. Global and Multi-Cloud Data Mesh
      [Diagram: Data Sources and Data Domains 1 to 5 in the Snowflake Data Cloud across regions (US East, FRA, Tokyo, Zurich), with an inventory of shared data products, a Snowflake Reader Account, and Consumers. Shared layer: Interoperability Standards, Federated Governance, 3rd Party Tools.]
      • Data sources, data domains, and consumers can sit in different regions and different cloud platforms
      • Snowflake enables a truly global and multi-cloud data mesh across cloud platforms and regions
  • 25. GOVERNANCE IN THE DATA CLOUD: Know, protect, and unlock your data
      Pillars: Know your data (What, Where, Who); Protect your data; Unlock your data
      Features listed, in slide order: Object Tagging, Auto Classification, Object Dependencies**, Access History (writes)**, Access History (data access audit), Row Access Policies, Dynamic Data Masking, External Tokenization, Conditional Masking, Secure Data Sharing, Data Exchange, Data Marketplace, Object Dependencies (impact analysis), Access History (data lineage)
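      To make the “protect your data” features concrete, the sketch below imitates the effect of a row access policy and dynamic data masking in plain Python. It is only a conceptual illustration under assumptions: in Snowflake these controls are declared as policies in SQL and enforced by the platform, and the role names, regions, and functions here are hypothetical.

```python
def mask_email(value, role, unmasked_roles=frozenset({"PRIVACY_OFFICER"})):
    """Return the email unchanged for privileged roles, else mask the local part."""
    if role in unmasked_roles:
        return value
    local, _, domain = value.partition("@")
    return "*" * len(local) + "@" + domain


def visible_rows(rows, role, role_regions):
    """Keep only rows in regions the role may see, then mask the email column."""
    allowed = role_regions.get(role, set())
    return [
        {**row, "email": mask_email(row["email"], role)}
        for row in rows
        if row["region"] in allowed
    ]


# Illustrative data and role mapping.
rows = [
    {"email": "ana@example.com", "region": "EU"},
    {"email": "bo@example.com", "region": "US"},
]
role_regions = {"EU_ANALYST": {"EU"}, "PRIVACY_OFFICER": {"EU", "US"}}

print(visible_rows(rows, "EU_ANALYST", role_regions))
print(visible_rows(rows, "PRIVACY_OFFICER", role_regions))
```

      The same query returns different results depending on who runs it, which is the property that lets a domain share one data product with many downstream consumers.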
  • 26. 4. Who is responsible?
  • 27. Who is responsible?
      Whether you call them data product managers, data stewards, data owners, data advocates, data custodians, or data trustees…
      Let’s revisit Accountability in the Data Product ABCs framework:
      ● Who is the owner?
      ● Who defines the requirements?
      ● Who fixes it when it breaks?
      ● Who defines the roadmap?
      ● Who has the expertise?
      What is the smallest number of critical “hats to wear”?
  • 28. Changing the Paradigm
      [Diagram: two panels, “Data Management as an Intermediary” and “Direct Data Producer and Data Consumer Collaboration”. Labels: Data Producer, Data Consumer, Data Platform, Data Engineering, Data Management.]
  • 29. Data Mesh: Domain-centric Responsibility
      [Diagram: data sources from different domains, domain-owned ETL/ELT pipelines and data models, and consumers. Domains shown: Customer, Helpdesk & Support, Products, Orders & Sales, Marketing & Promotions, Customer 360. Shared layer: Interoperability Standards, Federated Governance, Data Catalog. Layers: Data Sources, Data Integration, Data Management, Data Consumption.]
  • 30. 5. How to be agile?
  • 31. What is Agile Data Governance? The process of creating and improving data assets by iteratively capturing knowledge as data producers and consumers work together so that everyone can benefit. Empowering the usage of data safely. It adapts the deeply proven best practices of Agile and Open software development to data and analytics.
  • 32. Agile Data Governance Process: iterate!
  • 33. The time impact of being fast, incremental, and iterative
      [Diagram: a single sequence (Define standards and principles, Define policies, Build workflows, Release, Refine) compared with four use-case iterations (Use Case 1 to Use Case 4). Each iteration has the steps: Define standards/principles; Define policies; Build workflows; Release; Analysis, Insight, Value; Measure, Learn, Iterate.]
  • 34. Takeaways
      What is the scope?
          ● Identify the domains. You are already doing the work; they exist!
          ● Depends on the problems that are a priority to solve: Analytics, Data Platform, Enterprise Resources
      Who are the stakeholders of your data catalog?
          ● Always need Data Leadership
          ● Consumers, Producers, Governance, Privacy, Security, Platforms
      Where to standardize/productize data?
          ● Data Product ABCs: Accountability, Boundaries, Contracts & Expectations, Downstream Consumers, Explicit Knowledge
          ● Consumption, Data Mgmt, Data Producing Systems
      Who is responsible?
          ● Accountability: Owner, Requirements, Who Fixes, Roadmap, Expertise
          ● Consumption, Data Mgmt, Data Producing Systems
      How to be agile?
          ● Empowering the usage of data safely
          ● Develop a backlog of questions based on end-user business value
          ● Sprints, Peer Review, Collaborate, Iterate
  • 35. Learn more about data mesh governance
      What’s inside? How to…
          ● Establish a framework for treating data as a product
          ● Find the right balance of decentralization and centralization
          ● Transform data into knowledge
      Download it here: data.world/resources/reports-and-tools/data-mesh-governance-white-paper