Grab some coffee and enjoy the pre-show banter before the top of the hour!
The Briefing Room
Crawl, Walk, Run: How to Get Started with Hadoop
Twitter Tag: #briefr The Briefing Room
Welcome
Host:
Eric Kavanagh
eric.kavanagh@bloorgroup.com
@eric_kavanagh
Mission
• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today's innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!
Topics
This Month: HADOOP ECOSYSTEM
February: DATA IN MOTION
January: ANALYTICS
The Up Sides of Disruption
….Splice Machine?
Analyst: William McKnight
William is President of McKnight Consulting Group. His
clients have included 17 of the Global 2000. Many
clients have gone public with their success story. His
team's implementations have won multiple Best
Practices awards. William is an Entrepreneur of the
Year Finalist, a frequent best practices judge and an
expert witness. He has hundreds of articles and dozens
of white papers in publication. William has also given
numerous keynote presentations worldwide at major
conferences and has given hundreds of public seminars
and webinars. William’s experience includes taking his
company to placement on the Inc. 500 and the Dallas
100 and selling a multi-million dollar consulting firm.
He is a passionate communicator and motivator, and a
former IT VP of a Fortune 50 company.
Splice Machine
• Splice Machine is a SQL-on-Hadoop database
• The product is ACID-compliant and can power both OLAP and OLTP workloads
• Splice Machine is built on Java-based Apache Derby and HBase/Hadoop
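To make "ACID-compliant" and "both OLAP and OLTP workloads" concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in; it is not Splice Machine and does not use its API. The accounts table and the function names are hypothetical, chosen only for illustration.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """OLTP-style write: move `amount` between two accounts atomically."""
    try:
        # The connection context manager commits on success and
        # rolls back both updates if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def total_balance(conn):
    """OLAP-style read: aggregate over the same table the writes touch."""
    (total,) = conn.execute("SELECT SUM(balance) FROM accounts").fetchone()
    return total


def demo():
    # Hypothetical schema and data, held in memory for the example.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    ok = transfer(conn, 1, 2, 30)         # succeeds and commits
    rejected = transfer(conn, 1, 2, 500)  # overdraws, so it is rolled back
    balances = dict(conn.execute("SELECT id, balance FROM accounts"))
    return ok, rejected, balances, total_balance(conn)
```

The rejected transfer leaves no partial update behind, which is the atomicity guarantee a transactional SQL database is expected to provide; the aggregate query then reads the same table the writes went to.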
Guest: Rich Reimer
Rich Reimer, VP of Marketing and Product Management
Rich has over 15 years of sales, marketing and management experience in high-
tech companies. Before joining Splice Machine, Rich worked at Zynga as the
Treasure Isle studio head, where he used petabytes of data from millions of daily
users to optimize the business in real-time. Prior to Zynga, he was the COO and
co-founder of a social media platform named Grouply. Before founding Grouply,
Rich held executive positions at Siebel Systems, Blue Martini Software and Oracle
Corporation as well as sales and marketing positions at General Electric and Bell
Atlantic.
Perceptions & Questions
Analyst:
William McKnight
Source: Intel
WHAT HAPPENS IN AN INTERNET MINUTE
FUELED BY DISRUPTIVE TECHNOLOGY FACTORS
Social Media
Cloud Computing
Mobile
Internet of Things
Big Data is the next Natural Resource
“We have for the first time an economy
based on a key resource (Information)
that is not only renewable, but self-generating.
Running out of it is not a problem, but drowning in it is.”
— John Naisbitt
Transactional & Application Data
• Volume
• Structured
• Throughput
Machine Data
• Velocity
• Structured
• Ingestion
Social Data
• Variety
• Unstructured
• Veracity
Enterprise Content
• Variety
• Unstructured
• Volume
BIG DATA IS ADDITIVE TO
EXISTING DATA
IF THIS WERE EASY, EVERYONE WOULD ALREADY
BE LEVERAGING BIG DATA
“Big Data offers big business gains but hidden costs and complexity present
barriers that most organizations will struggle with”
- The Cost of Big Data, Eric Savitz, Forbes 5/2012
§  Big data skills are in short supply
§  Custom built solutions lack integrated management
§  Companies need to get used to the open source nature of the software
that is enhanced by committers
§  Requires integration effort within the existing analytic ecosystem
§  Big data will be less valuable per capita than other data
TOP PERFORMERS (GREATER THAN 15% ANNUAL GROWTH) REALIZE THEY NEED MORE
“What best describes your firm’s current usage/plans to adopt big data technologies and solutions?”
[Bar chart comparing High performance (>15% growth) (N = 58) with Rest of organizations (<15% growth) (N = 482)]
• Planning to implement in more than 1 year
• Planning to implement in the next 12 months
• Implemented, not expanding
• Expanding/upgrading implementation
Bar values in extraction order: 14%, 19%, 3%, 8%, 7%, 7%, 21%, 13%
Average performers are thinking about big data
Top performers are expanding their big data implementations
Source: 603 global decision-makers involved in business intelligence, data management, and governance initiatives; Forrsights Strategy Spotlight: Business Intelligence And Big Data, Q4 2012
VEHICLES FOR BIG DATA
[Diagram labels: Data Warehouse; Regional and Departmental Views; ADS; Applications & Engines; Operational Analytics & Hot Views; Data Marts; Independent; Dependent; Relational Data; Conformed Dimensions]
THE EVER-EXPANDING DATA WAREHOUSE
[Chart: Cost across Last Year, This Year, Next Year]
• Enterprise Data Warehouse users face huge annual upgrade expenses
• To avoid this spend, organizations are looking for lower cost alternatives
• Movement of data to tape not desired, because data is offline and not available for analytics
• Moving infrequently used data to Hadoop is a cost-effective, online option that preserves ability to query
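One way to picture the offload described above is a small rule that separates recently queried warehouse partitions from idle ones, so the idle ones can be moved to Hadoop. The sketch below is only illustrative: the Partition record, the 180-day threshold, and the function names are assumptions, not anything taken from the presentation or from a specific product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Partition:
    """Hypothetical catalog entry for one warehouse table partition."""
    name: str
    size_gb: float
    last_queried: date


def split_hot_cold(partitions, today, max_idle_days=180):
    """Return (hot, cold): cold partitions have not been queried
    within the last `max_idle_days` days (threshold is an assumption)."""
    cutoff = today - timedelta(days=max_idle_days)
    hot = [p for p in partitions if p.last_queried >= cutoff]
    cold = [p for p in partitions if p.last_queried < cutoff]
    return hot, cold


def gb_freed(cold):
    """Warehouse capacity released if every cold partition is offloaded."""
    return sum(p.size_gb for p in cold)
```

In practice the threshold would come from query logs and from what each terabyte costs on each platform; the point is only that the offloaded data stays online and queryable rather than going to tape.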
QUESTIONS FOR SPLICE MACHINE
• On the slide with the sad people overwhelming their RDBMS… how do we know when scale up has become cost prohibitive?
• What data should get moved to the data warehouses and data marts and what data is fine left in the data lake?
• Isn’t SQL-on-Hadoop SQL on HDFS?
• How is Splice Machine, as a SQL-on-Hadoop solution, giving the ‘best of both worlds’?
• How do you get data with schema into the flat files of HDFS without ‘data page’ style formatting?
• Is the best advantage of SQL-on-Hadoop having the full transformation capabilities of ETL or ELT on the data?
• Is a data lake the best ‘on-ramp’ to big data or is data archival off RDBMS?
Upcoming Topics
www.insideanalysis.com
This Month: HADOOP ECOSYSTEM
February: DATA IN MOTION
January: ANALYTICS
THANK YOU
for your
ATTENTION!
Some images provided courtesy of
Wikimedia Commons and Wikipedia
