© Eckerson Group 2019 Twitter: @weckerson www.eckerson.com
Best Practices in DataOps
How to Create Agile, Automated Data Pipelines
Wayne W. Eckerson
May 8, 2019
10 Symptoms You Need DataOps
1. Your data team is flooded with minor request tickets and is burning out.
2. Business users don’t trust the data because it contains too many errors.
3. Source system changes keep breaking your ETL jobs and data pipelines.
4. Business users don’t understand why it takes so long to get data.
5. You have difficulty meeting service level agreements (SLAs).
6. Data analysts write the same jobs and reports with minor variations.
7. Data scientists wait months for data and computing resources.
8. Your company can’t discern the true cost of migrating to the cloud.
9. Your data environment is too chaotic to implement predictive analytics.
10. Your self-service initiative has spawned hundreds of data silos.
Bonus: Your data lake is more of a data swamp.
Bonus: It takes months to deploy a single predictive model.
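Symptom 3 (source system changes breaking ETL jobs) is often caught late, after a job has already failed. One way to surface it earlier is to compare incoming records against an agreed schema before loading them. The sketch below is only an illustration: the column names, the `EXPECTED_SCHEMA` contract, and the `schema_violations` helper are all hypothetical and do not come from the deck.

```python
from typing import Dict, List

# Hypothetical contract: the columns and types the pipeline expects from a source.
EXPECTED_SCHEMA: Dict[str, type] = {"order_id": int, "amount": float, "country": str}


def schema_violations(record: dict, expected: Dict[str, type] = EXPECTED_SCHEMA) -> List[str]:
    """Return human-readable differences between a record and the expected schema."""
    problems = []
    # Columns the pipeline relies on must be present and have the expected type.
    for column, expected_type in expected.items():
        if column not in record:
            problems.append(f"missing column: {column}")
        elif not isinstance(record[column], expected_type):
            problems.append(
                f"wrong type for {column}: expected {expected_type.__name__}, "
                f"got {type(record[column]).__name__}"
            )
    # Columns the source added without notice are reported as well.
    for column in record:
        if column not in expected:
            problems.append(f"unexpected column: {column}")
    return problems
```

A pipeline could run this check on a sample of each incoming batch and alert the data team when the returned list is not empty, instead of letting a downstream job break.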
What is DataOps?
Agile
• Scrum, Kanban
• Business engagement
• Self-organizing teams
• Retrospectives
Lean
• Automation
• Orchestration
• Efficiency
• Simplicity
DevOps
• Team-based development
• Version control
• Continuous integration/delivery
• Test-driven development
TQM
• Performance management
• Performance metrics
• Continuous monitoring
• Benchmarking
DataOps
“A set of practices, processes, and technologies for building, operationalizing,
automating, and managing data pipelines from source to consumption.”
DataOps = Data Operations
DataOps History
DataOps applies the rigor of software engineering to the
development and execution of data pipelines.
Timeline, 1960s to 2020s:
• Eras: “Cowboy Coders”, then Team-based Development, then DevOps-based Development
• 2001: Manifesto for Agile Software Development published
• 2009: First DevOps event
• 2017: DataOps Manifesto published
• 2019: First DataOps event
Primary Use Cases
The use cases span a spectrum from “agile but ungoverned” to “governed but not agile.”
• Big Data
  Standardize and reuse core data pipeline components: ingest, transform, clean, etc.
  Biggest need: Reuse and Collaboration
• Data Science
  Create data science sandboxes on demand; deploy models automatically; monitor data drift.
  Biggest need: Self Service and Automation
• Self Service
  Centralize logic and permissions to facilitate data access and analysis while eliminating data silos.
  Biggest need: Governance and Infrastructure
• Data Warehousing
  Speed development by assigning agile teams to business groups to build end-to-end solutions.
  Biggest need: Speed and Prioritization
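The data science use case mentions monitoring data drift. As a rough illustration only, the sketch below flags drift when the mean of a new batch moves too far from a baseline, measured in baseline standard deviations. The function names and the threshold of 3.0 are assumptions made for this example; the deck does not prescribe a drift method, and production systems usually compare whole distributions rather than means alone.

```python
from statistics import mean, stdev
from typing import Sequence


def drift_score(baseline: Sequence[float], current: Sequence[float]) -> float:
    """Shift of the current mean from the baseline mean, in baseline standard deviations."""
    spread = stdev(baseline)  # sample standard deviation; needs at least two values
    if spread == 0:
        raise ValueError("baseline has no variation; this check does not apply")
    return abs(mean(current) - mean(baseline)) / spread


def has_drifted(baseline: Sequence[float], current: Sequence[float],
                threshold: float = 3.0) -> bool:
    """True when the mean shift exceeds the (assumed) threshold."""
    return drift_score(baseline, current) > threshold
```

A scheduled job could call `has_drifted` on each model input feature and raise an alert, which keeps the monitoring automated rather than dependent on someone noticing degraded predictions.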
Adoption
DOES YOUR ORGANIZATION HAVE A DATAOPS INITIATIVE?
• Yes: 27%
• Somewhat: 30%
• No: 43%
Based on 175 respondents from an Eckerson Group survey conducted in April 2019.
RESPONDENT ROLES
• IT or BI director or manager: 32%
• IT or BI architect,…: 29%
• Consultant: 10%
• Business manager - analytics: 9%
• Data analyst or scientist: 7%
• Business executive or …: 6%
• Data engineer: 5%
• Academic: 2%
• Vendor: 1%
• DataOps engineer: 0%
COMPANY SIZE
• Very small (<100 …): 18%
• Small (<500 …): 15%
• Medium (<1,000 emp…): 11%
• Large (<10,000): 29%
• Very large (> …): 26%
Benefits
• Faster cycle time: Continuous integration/delivery, reuse, automation
• Fewer data defects: Test-driven development and execution
• More scalability, reliability: Team-based development, continuous monitoring
• Lower costs: Higher development capacity, fewer errors
• More innovation: Focus efforts on value-add solutions and technologies
• Happier customers: Get more for less with greater trust and alignment
Benefits from Survey
BENEFITS OF DATAOPS
• Faster cycle times: 60%
• Happier business users: 55%
• Deliver new applications more quickly: 50%
• Fewer defects and errors: 50%
• Ingest new data sources more rapidly: 48%
• Faster change requests: 47%
• Increased development capacity: 47%
• Improved data governance: 42%
Challenges
DATAOPS CHALLENGES
• Establishing formal processes: 55%
• Orchestrating code and data across tools: 53%
• Staff capacity: 50%
• Monitoring the end-to-end environment: 50%
• Building rigorous tests upfront: 47%
• Lack of adequate automation tools: 42%
• Getting business users to buy into the process: 35%
• Adopting agile methods and teams: 34%
• Data is too hard to find: 26%
• Getting technical users to buy into the process: 23%
Components and Tools
RATE THE IMPORTANCE OF EACH DATAOPS COMPONENT (percentage rating it “High”)
• Agile development: 58%
• Continuous delivery: 54%
• Collaboration and reuse: 53%
• Continuous integration: 50%
• Code repository: 50%
• Data pipeline orchestration: 46%
• Performance and application monitoring: 46%
• Continuous testing: 41%
• Workflow management: 38%
• Change management request: 32%
• Containers and orchestration tools: 28%
Use Cases
DATAOPS USE CASES
• Data warehouses and marts: 66%
• Reporting and dashboarding: 60%
• Self-service analysis: 56%
• Data science and machine learning: 52%
• Data lake: 39%
• OLAP cubes for reporting and analysis: 29%
• Customer-facing applications: 34%
• Audit, compliance, security: 27%
Best Practices
The “Soft Stuff”
• Form a data department (with a CDO)
  Pull “data people” out of IT; unite data engineers, data scientists, and SW engineers. If you don’t have one already, add a CDO for executive clout.
• Map and assess your data environment
  Map data flows; assess waste, inefficiencies, manual processes, error sources, dev capacity.
• Educate your team about DataOps
  Expect resistance: “Data is different!” “Don’t slow us down!” Stick with it; “You can’t drive fast w/o brakes.”
• Create cross-functional dev teams
  Self-organizing, cross-trained, collaborative, agile teams that build end-to-end solutions.
• Align the teams with business priorities
  Align agile themes, initiatives, epics, and stories with business goals; get cross-functional priorities.
• Continuously review and refine processes
  It’s a journey; benchmark performance and continuously improve cycle times, capacity, reuse, and other core objectives.
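Benchmarking cycle times, as the last practice suggests, needs a consistent measure. A minimal sketch, assuming each request is recorded as a pair of dates (when it was requested and when it was delivered), is shown below. The data shape and function names are hypothetical, and the median is just one reasonable summary; the deck does not specify a metric.

```python
from datetime import date
from statistics import median
from typing import List, Tuple

# Assumed record shape: (date requested, date delivered).
Ticket = Tuple[date, date]


def cycle_times_in_days(tickets: List[Ticket]) -> List[int]:
    """Days from request to delivery for each ticket."""
    return [(delivered - requested).days for requested, delivered in tickets]


def median_cycle_time(tickets: List[Ticket]) -> float:
    """Median cycle time in days; less sensitive to one slow ticket than the mean."""
    return median(cycle_times_in_days(tickets))
```

Tracking this number per sprint or per month gives the team a baseline against which process changes can be judged.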
Best Practices (cont)
The “Hard Stuff”
• Start small and build incrementally
  Insist on business representation on the dev team; get cross-functional priorities monthly.
• Build for reuse
  Standardize ingest, transforms, configurations, code, data sets; use repositories & containers.
• Segregate duties and environments
  Use tools to migrate code from dev to test to production environments and segregate duties.
• Test and monitor everything
  Build tests before and after coding; use tests to monitor and automate data pipelines.
• Use DevOps and DataOps tools
  Repositories for data, code, configurations; tools for agile collaboration, CI/CD, testing, data catalog, orchestration, data glossary, unification.
• Create a self-service infrastructure
  Centralize logic; apply permissions for data access and functionality; automate report and model deployment; serverless, Kubernetes,
• Build for the enterprise
  Plan for security, governance, auditability, scalability, reliability, portability, and continuous monitoring.
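The advice to build tests before and after coding, and to use tests to monitor and automate pipelines, can be pictured as a pipeline step that refuses to run on bad input and refuses to publish bad output. The sketch below is one hypothetical way to do that in plain Python; `run_step`, `DataTestError`, and the example checks are illustrative names, not part of any DataOps tool mentioned in the deck.

```python
from typing import Callable, Iterable, List

Row = dict
Check = Callable[[List[Row]], bool]
Transform = Callable[[List[Row]], List[Row]]


class DataTestError(Exception):
    """Raised when a data test fails, so an orchestrator can halt the pipeline."""


def run_step(rows: Iterable[Row], transform: Transform,
             input_checks: List[Check], output_checks: List[Check]) -> List[Row]:
    """Run a transform only if input checks pass; return its result only if output checks pass."""
    data = list(rows)
    for check in input_checks:
        if not check(data):
            raise DataTestError(f"input check failed: {check.__name__}")
    result = transform(data)
    for check in output_checks:
        if not check(result):
            raise DataTestError(f"output check failed: {check.__name__}")
    return result


# Example checks and transform (hypothetical business rules).
def not_empty(rows: List[Row]) -> bool:
    return len(rows) > 0


def amounts_non_negative(rows: List[Row]) -> bool:
    return all(row["amount"] >= 0 for row in rows)


def drop_refunds(rows: List[Row]) -> List[Row]:
    return [row for row in rows if row["amount"] >= 0]
```

Because the same checks run on every execution, they double as monitoring: a failed check stops the run and names the test that failed, which is the kind of signal an orchestration tool can act on automatically.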
Summary
DataOps puts your data on a solid foundation
• Speeds cycle time, improves quality, increases capacity, reduces cost
Lets your data team focus on value-add
• Such as predictive analytics, streaming data, cloud computing
Increasing customer satisfaction and business value
DataOps is lights-out, automated data operations.
Questions?
I’m listening!
Wayne Eckerson
• 25+ year thought leader in data and analytics
• Sought-after speaker and consultant
• President, Eckerson Group
• Former director of research at TDWI
• Author of hundreds of articles and reports
Performance
Management
BI/Analytics
Get More Value from
Data and Analytics
