© Copyright 4/7/2015 BMC Software, Inc.
Robert Stinnett (@robertstinnett)
CARFAX Automation Analyst
October 14, 2014
We Came, We Saw, We Processed
How #BigData and #Hadoop integrated into @BMCControlM at CARFAX
CARFAX helps millions of people buy, sell and service their used cars better.
Agenda
1. Workload Automation at CARFAX
2. Big Data and Hadoop Initiative
3. Reduce, Reuse, Recycle
4. Batches, Unite!
1. Workload Automation at CARFAX
Manage: Support traditional batch, data transfer and SLA/SLE management across various datacenters.
Integrate: Everything under one roof. Integrate with other software packages to create an enterprise-wide workload management system.
Scale: Capacity on demand. React to the hyper-growing business.
• 129,000+ processes a day
• 350 batch nodes
• 5 different datacenters
• 1 unified workload management platform
It’s All About The Data
#Fast: Data can be loaded and live less than 30 minutes after receipt.
#Dynamic: 1 record to process today, 1 million records to process tomorrow.
#Reliable: Data is what makes CARFAX who we are. We can’t afford to “hope we got it right”.
Where Does it Come From?
34,000 Data Sources
13 billion records in our VHDB
Data comes in many formats, even pictures and PDFs!
CARFAX receives data in “any format, any time, any method”. We process more data in an hour than many businesses do in an entire month.
2. Big Data & Hadoop
#BigData: Our business is all about data. We’ve been doing Big Data long before it was “cool”.
#Integration: It had to integrate with our current business processes. It is a vital part of our data services and will provide and consume data from many other applications.
#Future: We see Hadoop and other Big Data initiatives replacing many of our legacy data processing systems.
Hadoop is still evolving and maturing. Thousands of pilot projects out there, but very few production installations.
We learned we were one of the pioneers when it came to integration.
3. Reuse, Reduce, Recycle
#DoMoreWithLess: Nobody wanted to reinvent the wheel. We didn’t need another scheduling system.
#TimetoMarket: The Hadoop team wanted to hit the ground running. Reuse what they were already familiar with.
#DevOps: Integrate with our existing DevOps practices across CARFAX. No silos!
Agile Development + Agile Operations
Increased usage of data center automation and configuration management tools
#DevOps
4. Batches, Unite!
#Today: We’ve managed to free up our personnel to do awesome things, and let the automation handle the routine stuff.
#Tomorrow: Integration, more agile operations, compliance and remediation. These are what we see on the horizon.
#AmazingJourney: This wasn’t a one-person or one-team project. It was a whole-company initiative. It has been an amazing journey, yet we’ve only just begun.
Overall Integration Strategy
[Timeline figure. Years on the axis: 2004, 2008, 2010, 2012, 2013, Beyond. Stages: Start “Run this script”; Today “Manage our workloads”; Future “Make it all just happen”. Integrations shown: Basic Batch; File Transfer Under Control-M Control; Database Processing, Web Services; Bladelogic, Java, Informatica; Hadoop, SAS, ServiceNOW; “Write Our Own”.]
#KeyTakeaways
1. Hadoop isn’t a silo; it’s part of other IT processes.
2. Significantly reduce your learning curve by using what you already have.
3. Integrations reduce management headaches.
4. Evolution from batch to workload
@robertstinnett
robertstinnett@Carfax.com
