Why Everything You Know About Big Data Is a Lie
Delivering Data-Driven Business Insights
Adopt
MarketInnovate
Sunil S Ranka
Director – Big Data and Advance Analytics
Key Topics
• About Jade
• About Me
• What Is Big Data
• Key Myths
• Why Everything Is a Lie
• Real-World Example
• Next Steps
750+ Technology Projects
200+ Customers
100% Referenceable
98% Customer Retention
500 IT professionals worldwide
Global Delivery Model Serving Many Industries
Industries: Services, High-Tech, Manufacturing, Energy, Social Media & Entertainment
5 Global Delivery Centers
8 Offices Worldwide: Atlanta, Pune, Noida, San Jose, Los Angeles, London, Hyderabad, San Diego
Strategic Partnerships
• Salesforce.com: Sales, Service, Marketing, force.com
• Testing Tools/Frameworks: QC, QTP, Selenium, LoadRunner, JIRA, Bugzilla, JUnit, TestNG
• Microsoft: Dynamics, SharePoint, Office 365, Lync, BI
• Custom Development: Java, .Net, J2EE, Product Engineering, Open Source Technologies
• Integration: Oracle SOA, Tibco, Weblogic, Oracle Cloud Platform, ICS, JCS, Mulesoft, Dell Boomi
• Infrastructure Management: IBM AIX, HP-UX, RHEL, OEL Linux, Windows Server
• Oracle EBS Suite: Cloud Financials, Projects, SCM, HCM and EBS Financials, Procurement, Value Chain, CRM, Demantra, Agile, GRC
• ServiceNow: IT Service Automation Applications, CreateNow Development Suite, Orchestration, Discovery
• Big Data & Analytics: Hadoop, KNIME, R, Tableau
Jade Global Clientele (Representative list)
Dilbert On Big Data
During a Data Strategy Session
About Me
• Venture Partner: investor in and advisor to early-stage startups focused on data.
• Director – Big Data and Advance Analytics
• Oracle ACE (Business Intelligence with Proficiency in Big Data)
• Worked extensively with Fortune 500 leaders.
• Held positions of Head of Product Development, Architect, etc.
• http://sranka.wordpress.com, sunil_ranka
• Featured tech writer for IT Next Magazine.
• Speaking engagements at the following conferences:
• COLLABORATE (2009, 2010, 2011, 2012, 2013, 2015)
• BIWA SIG TechCast Series (2010, 2011, 2012, 2013, 2014, 2016)
• NorCal OAUG 2010 at Santa Clara Convention Center, CA
• Session speaker at NoCOUG in San Francisco
• Oracle Open World (2009, 2010, 2012)
My tag line: “Superior BI is the antidote to Business Failure”
Why Data Is Important
Data is the new oil. Data is just like crude. It’s valuable, but if unrefined it cannot really be used.
– Clive Humby, DunnHumby
We have for the first time an economy based on a key resource [information] that is not only renewable, but self-generating. Running out of it is not a problem, but drowning in it is.
– John Naisbitt
Big Data and Analytics is Helping
• Smarter Revenue Management: a tax agency reduced improper payments by $16 billion
• Smarter Healthcare Analytics: helps detect life-threatening conditions up to 24 hours sooner
• Smarter Crime Prevention: cut serious crime by 30%
* Courtesy - IBM
Analytics Maturity Pyramid
• No Reporting: struggling to get basic information
• Reactive Analytics: concerned with current issues. What happened?
• Diagnostic Analytics (hindsight): Why did it happen?
• Predictive Analytics (insight): What will happen?
• Prescriptive Analytics (foresight): What should I do?
What is Big Data
Big data represents new data features created by today’s data-driven organizations for decision making.
Characteristics of big data:
• Volume (data at scale): terabytes to petabytes of data
• Variety (data in many forms): structured, unstructured, text, media
• Velocity (data in motion): analysis of streaming data to make decisions in real time
• Value (data with insight): deriving valuable insight from the data
Harnessing Big Data
• OLTP: Online Transaction Processing (DBMSs)
• OLAP: Online Analytical Processing (Data Warehousing)
• RTAP: Real-Time Analytics Processing (Big Data Architecture & Technology)
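To make the difference between OLAP-style and RTAP-style processing concrete, here is a minimal Python sketch. It is an illustration only, not something from the deck: the names `batch_mean` and `SlidingWindowMean` and the window size are hypothetical. The batch function needs the whole data set before it can answer, while the sliding-window class updates its answer as each new reading arrives.

```python
from collections import deque


def batch_mean(values):
    """OLAP-style: compute one answer after all the data has been collected."""
    return sum(values) / len(values)


class SlidingWindowMean:
    """RTAP-style: keep a running answer over the most recent readings."""

    def __init__(self, size):
        # deque with maxlen silently drops the oldest reading when full
        self.window = deque(maxlen=size)

    def update(self, value):
        """Add one reading and return the mean of the current window."""
        self.window.append(value)
        return sum(self.window) / len(self.window)


readings = [1, 2, 3, 10]
stream = SlidingWindowMean(size=3)
running = [stream.update(r) for r in readings]  # one answer per arriving reading
overall = batch_mean(readings)                  # one answer at the end
```

The streaming version can flag the jump to 10 as soon as it arrives, whereas the batch version only reports it once the full set is processed.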
Who’s Generating Big Data
• Social media and networks (all of us are generating data)
• Scientific instruments (collecting all sorts of data)
• Mobile devices (tracking all objects all the time)
• Sensor technology and networks (measuring all kinds of data)
Progress and innovation are no longer hindered by the ability to collect data, but by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion.
The Model Has Changed…
The model of generating and consuming data has changed.
Old Model: a few companies generate data; all others consume data
New Model: all of us generate data, and all of us consume data
What are the Myths
Myths
• Big data will change everything.
• Big data means 'a lot' of data.
• A data lake is big data.
• Hive can be used for reporting.
• Big data is only for large corporations.
• You need to hire a big data scientist to start with big data.
• Big data technology will eliminate the need for data integration.
• The only cost for big data is hardware and software.
Myths, Continued…
• Big data applications require little or no performance optimization.
• I don’t have enough data for big data.
• Big data predicts the future.
• Hadoop is the Holy Grail of big data.
• Big data is an IT matter.
• Data warehouses aren’t needed for advanced analytics.
• Hadoop will replace enterprise data warehouses.
• With huge volumes of data, small data quality issues are acceptable.
What Really Works
Big Data Needs Diversified Skill Sets
• Math and Operations Research Expertise: develop analytic algorithms
• Visualization Expertise: interpret data sets, determine correlations and present in meaningful ways
• Tool Developers: mask complexity and analytics to lower skills boundaries
• Industry Vertical Domain Expertise: develop hypotheses, identify relevant business issues, ask the right questions
• Data Experts: data architecture, management, governance, policy
• Decision Making, Executive and Management: apply information to solve business issues
"By 2015, big data demand will reach 4.4 million jobs globally, but only one-third of those jobs will be filled."
Source: Gartner "Gartner's Top Predictions for IT Organizations and Users, 2013 and Beyond: Balancing Economics, Risk, Opportunity and Innovation" 19 Oct 2012
Industry Implementation Trends
Hybrid Approach (Large Enterprises)
• Building hybrid environments as they want to leverage their existing investments in their traditional environments
• Setting up their own internal cloud environments for security and regulatory reasons, as well as to achieve the cloud benefits of simplicity and elasticity
Migrating Legacy Applications (Medium Enterprises)
• All new investments are in the cloud
• Migrating existing on-premises systems to the cloud based on ROI and business objectives
Starting with Cloud (Small & Startup Enterprises)
• Embracing cloud as they do not have any legacy systems
Different Phases
Phase 1 (Experimental)
• Understand the capabilities of the big data ecosystem
• Develop basic skills in big data management
• Create a pilot use case
• Establish leadership commitment
• Establish working infrastructure
Phase 2 (Implementation)
• Work with the business and identify use cases
• Commit dedicated resources to development and operations
• Develop an agile project plan
• Educate business users on analytics
• Accelerate the analytics knowledge and skills required to support value creation
• Use partners to supplement analytic skills gaps
Phase 3 (Expansion)
• Expand to multiple use cases
• Establish IT SLAs, ROI metrics and growth plans
• Expand to more advanced predictive capabilities
• Enable a platform capable of managing greater volumes and variety of data
• Look to partners to simplify and modernize the existing platform with cost-effective delivery models
Phase 4 (Optimization)
• Optimize and integrate apps on a converged data platform
• Establish digital business practices as the new normal, supported by all key executive sponsors
• Provide detailed business SLAs, revenue targets, and other financial targets
• Normalize data lifecycle/governance, data monetization, microservice development
Real World
Data Lake Reference Architecture (diagram)
Data Staging: Company 1 Data, Company 2 Data, Company 3 Data, Company 4 Data
Reusable Jade Connectors
Data Lake:
• Normalization and integration: Master Metadata, Surrogate Keys, Key Exists, Exception Handling
• Feature DataSet: Customer, Institution, Accounts
• Measure Data Set: Key Accounts, Partnership, Sales, GL, Margins
• Derived/Aggregated Fact: Gross Margin, Aggregates
• Unified Customer View, Unified Sales Views, Unified Partner Views
Predictive Analytics Layer (Machine Learning)
Predictive Analytics Outcome: Customer Retention, Cross Sell, Up Sell, Customer Segmentations, Customer 360, Revenue Forecast, Customer Churn
Data Service Layer: API Layer, Reporting Layer
Outputs: Real-Time Analytics, Hourly/Daily Report, Weekly/Monthly Report
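The diagram labels "Surrogate Keys", "Key Exists" and "Exception Handling" suggest a normalization step that assigns stable keys to records staged from several companies and sets aside records that cannot be keyed. The Python sketch below is one possible reading of that step, not the architecture's actual implementation; the record fields (`source`, `customer_id`, `sales`) and the names `SurrogateKeyMap` and `integrate` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SurrogateKeyMap:
    """Maps (source, natural key) pairs to stable integer surrogate keys."""
    keys: dict = field(default_factory=dict)
    next_key: int = 1

    def resolve(self, source, natural_key):
        pair = (source, natural_key)
        if pair not in self.keys:          # the "key exists" check
            self.keys[pair] = self.next_key
            self.next_key += 1
        return self.keys[pair]


def integrate(records, key_map):
    """Split staged records into integrated rows and exceptions."""
    integrated, exceptions = [], []
    for rec in records:
        if not rec.get("customer_id"):     # exception handling: no natural key
            exceptions.append({**rec, "reason": "missing customer_id"})
            continue
        sk = key_map.resolve(rec["source"], rec["customer_id"])
        integrated.append(
            {"customer_sk": sk, "source": rec["source"], "sales": rec["sales"]}
        )
    return integrated, exceptions


staged = [
    {"source": "company1", "customer_id": "C-1", "sales": 100.0},
    {"source": "company2", "customer_id": "C-1", "sales": 50.0},
    {"source": "company1", "customer_id": "C-1", "sales": 25.0},
    {"source": "company3", "customer_id": None, "sales": 10.0},
]
key_map = SurrogateKeyMap()
rows, errors = integrate(staged, key_map)
```

This sketch keys each customer per source company. Building a truly unified customer view would additionally require matching the same customer across companies, which is outside the scope of this illustration.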
Data Lake Reference Architecture (diagram)
Sources: Source System, File Data, OB Data, ETL Extracts, Streaming, Logs (or other unstructured data), Cloud Services
Hadoop Data Lake zones: Transient Loading Zone, Raw Data, Refined Data, Trusted Data, Discovery Sandbox, Consumption Zone
Raw data: original unaltered data attributes, tokenized data
Processing: integrate to common format, data validation, data cleansing, aggregations
Reference Data, Master Data
Discovery: data wrangling, data discovery, exploratory analytics
Governance: Metadata, Data Quality, Data Catalog, Security
Consumption: APIs, OLP or ODS, Enterprise Data Warehouse
Users: Business Analysts, Researchers, Data Scientists
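The zone names in this diagram (raw, refined, trusted) together with the labels "Integrate to common format", "Data Validation", "Data Cleansing" and "Aggregations" describe a promotion path for data through the lake. The Python sketch below shows one way such a path could look. Which step belongs to which zone is an assumption, and the function names and the `account`/`amount` fields are hypothetical.

```python
def to_refined(raw_rows):
    """Raw -> refined: integrate to a common format (normalized keys, trimmed text)."""
    refined = []
    for row in raw_rows:
        refined.append({
            key.strip().lower(): value.strip() if isinstance(value, str) else value
            for key, value in row.items()
        })
    return refined


def to_trusted(refined_rows):
    """Refined -> trusted: validate and cleanse; rejected rows are kept for review."""
    trusted, rejected = [], []
    for row in refined_rows:
        try:
            amount = float(row["amount"])   # validation: amount must be numeric
        except (KeyError, TypeError, ValueError):
            rejected.append(row)
            continue
        if not row.get("account"):          # validation: account is required
            rejected.append(row)
            continue
        trusted.append({"account": row["account"], "amount": amount})
    return trusted, rejected


def aggregate(trusted_rows):
    """Trusted -> consumption: total amount per account."""
    totals = {}
    for row in trusted_rows:
        totals[row["account"]] = totals.get(row["account"], 0.0) + row["amount"]
    return totals


raw = [
    {" Account ": " A1 ", "Amount": "10.5"},
    {"account": "A1", "amount": "4.5"},
    {"account": "A2", "amount": "n/a"},
    {"account": "", "amount": "3"},
]
trusted, rejected = to_trusted(to_refined(raw))
totals = aggregate(trusted)
```

The raw rows are never modified, which matches the diagram's note that the raw zone holds original unaltered data attributes. Each later zone is derived from the one before it.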
Where Does Big Data Fit In
Hybrid Analytics Cloud/On Premises (diagram)
Analytics Cloud/OnPrem: Ingest, Transform, Analyze. Integrated ingestion, transformation, and query application for business analysts.
Data Cloud/OnPrem: Hive Metastore, Elastic Cloud HDFS, Infinite Compute, Hadoop/Spark. World-class, elastic Big Data infrastructure.
Web interface for distributed users: data set definition, social metadata dictionary, export. Web interface to dashboarding, query, and data dictionary.
Outputs: External Dashboards, Internal Dashboards, Tableau, Excel, R, Zeppelin
Hybrid Analytics Cloud/OnPrem (diagram; same components as above, annotated with the following activities)
• Build connectors
• Write ETL and perform data engineering
• Business analytics, data science training
• Build reports and dashboards
• Build outgoing connectors
How We Can Help

Editor's Notes

  • #12: Oil has long been the fuel of the modern economy. However, oil in its raw form has little value. It needs to be refined and separated into a large number of consumer products, from petrol and kerosene to asphalt and the chemical reagents used to make plastics and pharmaceuticals. It is also used in manufacturing a wide variety of materials. Big Data is just like oil: in its raw form it provides no value to the enterprise until it is processed and valuable, actionable business insights are “distilled”. Just as technology became available 100 years ago to discover oil and process it into consumable products, Big Data technology is going to transform and revolutionize the way enterprises get and use data.