Building Your Enterprise Data
Marketplace with DMX-h
Jennifer Cheplick
Sr. Director, Product Marketing
Today’s agenda
• The Need for an Enterprise Data Marketplace
• Attributes of a Successful Enterprise Data Marketplace
• Building an Enterprise Data Marketplace
• Potential Roadblocks
• How Syncsort Helps
Data Growth
2.5Q (quintillion) bytes of data created every day
90% of the world’s data generated in the past two years alone
200B smart devices projected by 2020
Data Delivers
Competitive
Advantage
“Compared with their
peers, high performers
report a greater variety
of actions to monetize
data – with greater
revenue impact”
- McKinsey Global Survey: Fueling growth
through data monetization
73.2%
Percentage of executives
whose firms have
achieved measurable
results from Big Data
and AI investments
- NewVantage Partners Big Data Executive
Survey 2018
$1.8 Trillion
Projected annual
revenue for insights-
driven businesses by
2021
- “Insights-Driven Businesses Set the Pace
for Global Growth,” Forrester, October 19,
2018
85%
Firms that leverage
customer behavioral
insights outperform peers
by 85 percent in sales
growth and 25 percent in
gross margin
- McKinsey Global Survey: Capturing value
from your customer data
Promise of a Data-Driven Culture
ACCURATE ANALYTICS & FASTER TIME-TO-VALUE
▪ Reduce bias, uncertainty, and misunderstanding
▪ Uncover new, previously inaccessible insights
▪ Accelerate speed of organizational decision-making
TARGETED MARKETING & REVENUE GROWTH
▪ Gain the most accurate, in-depth view of your customers
▪ Monitor and respond to customer activity in real-time
REDUCED RISK & COMPLIANCE WITH CONFIDENCE
▪ Ensure confidence in regulatory reporting
▪ Identify and manage risk more quickly and completely
OPERATIONAL EFFICIENCY & COST REDUCTION
▪ Minimize time spent on manual data preparation
▪ Ensure accuracy of global operations and supply chain
• Data has outgrown the
data warehouse
• Data lakes can be
polluted and chaotic
• Data is inconsistent
across data marts
• Every part of the
business demands
sophisticated data
analysis
• Departments need
access to the
company’s many
data sets,
combined in
different ways
• IT can’t be a
bottleneck
But most
organizations
are not getting
the full value of
their data
91% of organizations
have not yet reached
a “transformational”
level of maturity in
data and analytics
- Gartner
68% of IT professionals
state that data silos
negatively impact their
organization’s ability
to get value from their
data
The Rise of
The Enterprise
Data
Marketplace
• Enables data-driven
organizations
• Analytics teams and
business users can shop
and find the data they
need
• Data can be combined
for ever-expanding
applications
Overcomes the
limitations of previous
solutions to deliver
the best of each, in
one central repository
• Volume and variety of
the data lake
• Veracity and auditability
of the data warehouse
• Velocity and specificity
of purpose of the data
mart
Enterprise Data
Marketplace
Attributes:
Reliability
Provides a centralized location for
curated, trusted data that is:
• Clean
• Standardized
• Verified
Guardian Life Insurance
needed to enable Machine
Learning, visualization and BI
on a broad range of datasets,
and reduce time-to-market for
analytics projects.
• Reduce data preparation,
transformation times
• Make data assets available to
whole enterprise – including
Mainframe data
Data Marketplace –
centralized, reusable, up-to-the-
minute current, searchable,
accessible, managed,
trustworthy data for analytics
Fast Time-to-Market
for new analytics and reporting
Enterprise Data
Marketplace
Attributes:
Flexibility
Pulls data from across the
enterprise and allows users to pick
and choose the data they need,
depending on what they want to
accomplish.
Progressive Insurance needs
cost-effective, easily accessible
operational data – including
Claims Liability, Policy,
Customer, Incident and more –
for advanced analytics
• Data marketplace includes 50
data sources
• More are added as business
needs evolve
Better Analytics – with readily
accessible, up-to-date data.
Fast Analytics Time-to-Market –
Data available in hours not days.
Audit Trails for Compliance
while keeping the EDW current
Low Archival Costs
Enterprise Data
Marketplace
Attributes:
Availability
Empower analytics teams to create
new data schemas on their own
• The right data sets are available
• Data is always up to date and
ready for various types of
analytics
• Removes wait times and IT
bottlenecks
Analysts at Symphony
Health no longer wait for
requests for specific data
schemas, or data subsets,
to work their way through
the IT team’s queue
“Before, part of the
data wasn’t available
for a day, and other
parts, not for a week.
Now it’s all available
for analysis within
minutes of the data
arriving.”
Robert Hathaway
Senior Manager Big Data
Today’s agenda
• The Need for an Enterprise Data Marketplace
• Attributes of a Successful Enterprise Data Marketplace
• Building an Enterprise Data Marketplace
• Potential Roadblocks
• How Syncsort Helps
Building an Enterprise Data Marketplace
Data Lake or Cloud
Raw Landing Zone
Access & Onboard – Elect to include data to understand
• What you don’t know CAN hurt you – e.g. bias
• If you’ve left it out, you cannot know it exists
• Data sets have more power to predict when combined
Building an Enterprise Data Marketplace
Data Lake or Cloud
Raw Landing Zone
Refined Zone
Refine – cleanse, enrich, de-duplicate
• What data needs refinement? – use cases will determine
• Each data set should be refined once – don’t repeat work
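The refine step above can be sketched in a few lines of plain Python. This is only an illustration of cleanse, enrich and de-duplicate as a single pass that runs once per data set; it is not how DMX-h or Trillium implement it, and the field names, the `zip_to_state` lookup and the choice of `email` as the de-duplication key are all assumptions made for the example.

```python
import re


def cleanse(record):
    """Trim whitespace and normalize case so equal values compare equal."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        "zip": re.sub(r"\D", "", record["zip"])[:5],
    }


def enrich(record, zip_to_state):
    """Add a state column from a (hypothetical) ZIP-to-state lookup table."""
    return {**record, "state": zip_to_state.get(record["zip"], "UNKNOWN")}


def deduplicate(records, key="email"):
    """Keep the first record seen for each key value."""
    seen = {}
    for record in records:
        seen.setdefault(record[key], record)
    return list(seen.values())


def refine(raw_records, zip_to_state):
    """Cleanse, then enrich, then de-duplicate, in that order."""
    cleaned = [cleanse(r) for r in raw_records]
    enriched = [enrich(r, zip_to_state) for r in cleaned]
    return deduplicate(enriched)


# Made-up sample rows from a raw landing zone.
raw = [
    {"name": "  ada   lovelace ", "email": "Ada@Example.com ", "zip": "10001-1234"},
    {"name": "Ada Lovelace", "email": "ada@example.com", "zip": "10001"},
    {"name": "alan turing", "email": "alan@example.com", "zip": "94105"},
]
refined = refine(raw, {"10001": "NY", "94105": "CA"})
```

Cleansing runs before de-duplication on purpose: the two Ada records only collapse into one after their email addresses have been normalized.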
Building an Enterprise Data Marketplace
Data Lake or Cloud
Raw Landing Zone
Refined Zone
Track Provenance
• Data lineage documentation is necessary for establishing that data
can be trusted, and for auditing and regulatory compliance
• Also useful for reproducing steps in production machine
learning data pipelines
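One way to picture provenance tracking is a log that records every transformation together with a fingerprint of its input and output. The sketch below is a minimal illustration using only the Python standard library; the `LineageLog` class and the two sample transformations are hypothetical names invented here, and the sketch does not represent the metadata format of DMX-h, Cloudera Navigator or Apache Atlas.

```python
import hashlib
import json
from dataclasses import dataclass, field


def fingerprint(rows):
    """Stable hash of a data set's content, used to detect silent changes."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class LineageLog:
    """Ordered record of the steps applied to a data set (hypothetical)."""
    steps: list = field(default_factory=list)

    def apply(self, step_name, func, rows):
        """Run one transformation and record what went in and what came out."""
        result = func(rows)
        self.steps.append({
            "step": step_name,
            "input_hash": fingerprint(rows),
            "output_hash": fingerprint(result),
            "rows_in": len(rows),
            "rows_out": len(result),
        })
        return result


def drop_missing_amounts(rows):
    return [r for r in rows if r.get("amount") is not None]


def to_cents(rows):
    return [{**r, "amount": round(r["amount"] * 100)} for r in rows]


log = LineageLog()
raw = [{"id": 1, "amount": 12.5}, {"id": 2, "amount": None}]
step1 = log.apply("drop_missing_amounts", drop_missing_amounts, raw)
step2 = log.apply("to_cents", to_cents, step1)
```

Because each step's output hash equals the next step's input hash, an auditor can check that the chain is unbroken, and the same ordered list of step names can be replayed to reproduce the preparation in a production pipeline.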
Building an Enterprise Data Marketplace
Data Lake or Cloud
Raw Landing Zone
Refined Zone
Shop for data sets, features & validate against your questions
• Analyst, data scientist shops for data
• What do I need for my purpose?
• Quality is already assured, provenance documented
• Improves trust, saves time
5 Potential Roadblocks to Building Your
Enterprise Data Marketplace
Siloed, Hard to Reach Datasets
• Can be trapped in hard-to-reach systems like mainframes, etc.
• Found in streams from POS, web clicks, etc.
• Incompatible formats, making it difficult to gather and prepare the data for model training.
Data Cleansing at Scale
• Cleanse, enrich, de-duplicate
• What data needs refinement? – use cases will determine
• Each data set should be refined once – don’t repeat work
Tracking Lineage from the Source
• Capture of complete lineage, from source to end point – across systems – is needed.
• Data changes made to help train models have to be exactly duplicated in production, in order for models to accurately make predictions on new data, and for required audit trails.
Entity Resolution
• Matching records across massive datasets that indicate a single specific entity (person, company, product, etc.)
• Requires sophisticated multi-field matching algorithms and a lot of compute power.
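To make the entity resolution roadblock concrete, here is a small sketch of multi-field matching using `difflib.SequenceMatcher` from the Python standard library. The field weights, the 0.85 threshold and the greedy clustering strategy are assumptions chosen for the example, not the matching method of any Syncsort or Trillium product.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; real deployments tune these per data set.
WEIGHTS = {"name": 0.5, "address": 0.3, "phone": 0.2}


def similarity(a, b):
    """String similarity in [0, 1] after basic normalization."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_score(rec_a, rec_b):
    """Weighted average of per-field similarities."""
    return sum(w * similarity(rec_a[f], rec_b[f]) for f, w in WEIGHTS.items())


def resolve(records, threshold=0.85):
    """Greedy clustering: attach each record to the first cluster it matches."""
    clusters = []
    for record in records:
        for cluster in clusters:
            if match_score(record, cluster[0]) >= threshold:
                cluster.append(record)
                break
        else:
            # No existing cluster matched, so this record starts a new entity.
            clusters.append([record])
    return clusters


# Made-up sample records: the first two describe the same person.
records = [
    {"name": "Jon Smith", "address": "12 Main St", "phone": "555-0100"},
    {"name": "John Smith", "address": "12 Main St.", "phone": "555-0100"},
    {"name": "Maria Garcia", "address": "9 Oak Ave", "phone": "555-0199"},
]
clusters = resolve(records)
```

Every record is compared against every existing cluster, so the work grows roughly quadratically with the number of entities. That cost is why the slide calls out compute power; at scale the comparisons are typically restricted to candidate groups and run in parallel.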
Ongoing Real-Time Changed Data Capture
• Tracking and detection need to happen very rapidly.
• Current transactions need to be constantly added to combined datasets, prepared and presented to models as close to real-time as possible.
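The idea of continuously folding current transactions into a combined data set can be sketched as applying an ordered stream of change events to a keyed snapshot. The event shape used below (`op`, `key`, `row`) is a hypothetical format invented for this illustration; actual change data capture tools define their own record layouts.

```python
def apply_changes(target, change_events):
    """Apply ordered change events to a keyed snapshot, in place.

    Each event is a dict with a hypothetical shape:
    {"op": "insert" | "update" | "delete", "key": ..., "row": {...}}
    """
    for event in change_events:
        op, key = event["op"], event["key"]
        if op == "delete":
            target.pop(key, None)
        elif op in ("insert", "update"):
            # Upsert, so replaying the same event twice is harmless.
            target[key] = {**target.get(key, {}), **event["row"]}
        else:
            raise ValueError(f"unknown operation: {op}")
    return target


# Made-up snapshot and change stream.
snapshot = {101: {"customer": "Ada", "balance": 50}}
events = [
    {"op": "update", "key": 101, "row": {"balance": 75}},
    {"op": "insert", "key": 102, "row": {"customer": "Alan", "balance": 20}},
    {"op": "delete", "key": 101},
]
apply_changes(snapshot, events)
```

Only the changed rows travel through the pipeline, which is what keeps the combined data set close to real-time without reloading the full source each time.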
Today’s agenda
• The Need for an Enterprise Data Marketplace
• Attributes of a Successful Enterprise Data Marketplace
• Building an Enterprise Data Marketplace
• Potential Roadblocks
• How Syncsort Helps
Build Your Enterprise Data Marketplace with Syncsort
Access: Onboard ALL enterprise data.
Integrate: Join, transform, cleanse, de-duplicate batch or streaming data.
Comply: Secure, govern, manage and monitor everything.
Simplify: Design once, deploy anywhere.
Simplify Big Data Integration with Syncsort
Simplify Big Data Integration
Access: Onboard ALL enterprise data.
Access & Integrate ALL Enterprise Data – Mainframe to Streaming
Data Sources
Onboard data, modify
on-the-fly to match
Hadoop storage model,
or store unchanged for
archive and compliance.
Access data from
streaming and batch
sources outside
cluster.
Cluster or Cloud
Data
Refine, transform, join,
cleanse, enhance
data in cluster or Cloud
with MapReduce,
EMR, or Spark.
Comply: Govern and Track Everything for Compliance
• Metadata and data lineage for Hive, Avro and Parquet
through HCatalog
• Metadata lineage export and API from DMX/DMX-h
• Simplify audits, analytics dashboards, metrics
• Integrate with enterprise metadata repositories
• Cloudera Navigator certified integration
• Track lineage from source – even changes made off cluster
• HDFS, YARN, Spark and other metadata
• Lineage, tagging
• Business and structural metadata
• Apache Atlas ingestion lineage integration
• Lineage, tagging
• Track lineage from source – even changes made off cluster
DMX-h
Comply: Secure the Entire Process
• Native Kerberos and LDAP support
• Kerberos-secured clusters
• Authenticated browsing
• Authenticated sampling
• Security certified
• Apache Ranger
• Apache Sentry
• FTPS, Connect:Direct secure data transfers
DMX-h
Simplify: Design Once, Deploy Anywhere
Intelligent Execution - Insulate your organization from underlying complexities of Hadoop.
Get excellent performance every time
without tuning, load balancing, etc.
No re-design, re-compile, no re-work ever
• Future-proof job designs for emerging
compute frameworks, e.g. Spark 2.x
• Move from dev to test to production
• Move from on-premise to Cloud
• Move from one Cloud to another
Use existing ETL skills
No parallel programming – Java, MapReduce, Spark …
No worries about:
• Mappers, Reducers
• Big side or small side of joins …
Design Once
in visual GUI
Deploy Anywhere!
On-Premise,
Cloud
MapReduce, Spark,
Future Platforms
Windows, Unix,
Linux
Batch,
Streaming
Single Node,
Cluster
Trillium Quality for Big Data – Data Cleansing at Scale
Boost effectiveness of machine learning, AI with complete, standardized data.
1. Visually create and test data
quality processes locally
2. Execute in MapReduce or Spark
On premise or in the Cloud
Build Your
Enterprise
Data
Marketplace
with Syncsort
“Ingestion has
gone from
days to hours”
- Progressive Big Data Tech
Lead
“DMX-h is already
optimized. We use
its Intelligent
Execution and it
just performs.”
- Robert Hathaway
Senior Manager Big Data,
Symphony Health
“We found DMX-h
to be very usable
and easy to ramp
up in terms of
skills. Most of all,
Syncsort has been
a very good
partner in terms
of support and
listening to our
needs.”
- Alex Rosenthal, Enterprise
Data Office, Guardian Life
Insurance
Visit
www.syncsort.com
to learn more
