Property of Relational Solutions, Inc. By Janet Dorenkott June, 2013,
JANET DORENKOTT, BIO 
• Over 20 years of experience in information technology. 
• Founded Relational Solutions in 1996 and co-owns it with Rob York. 
• Focused on data warehousing, data integration & business intelligence solutions. 
• Specializes in the complex issues associated with integrating point of sale and syndicated data 
for the CPG industry & developed applications including POSmart and BlueSky, designed for 
handling data complexities unique to CPG companies. 
• Member of RetailWire’s BrainTrust. 
• Founder of the Demand Signal Repository Institute on LinkedIn. 
• Participated in the implementation of over 200 data warehouse and BI projects for companies 
that include Chrysler, Chase, Timken, Xerox, Glaxo, Smuckers, P&G and many others. 
GOALS FOR TODAY 
• TO DEFINE BIG DATA 
• EXPLAIN HOW BIG DATA CAN IMPROVE BUSINESS 
• EXPLAIN HOW TO USE IT 
• SHOW THE IMPORTANCE OF LEVERAGING SOCIAL MEDIA
BIG DATA… IT’S IN OUR BLOOD! 
Timeline graphic (labels as they appear on the slide): 
• Timeline periods: 1996-98, 1999-01, 2002-04, 2005-06, 2007-08, 2009-10, 2011-12, 2013 
• “Top 10 Companies on the Move” 
• BlueSky Integration Studio 
• “Best at integrating POS with Internal data” 
• Cleveland Weatherhead 100 Fastest Growing Businesses 
• Oracle Developer of the Year 
• Data Warehouse & BI Consulting 
• “Data Warehouse of the Year!” 
• BlueSky 
• “Coolest New Technologies” 
• DataStage ETL Best Implementors Award 
• Informatica’s Partner of the Year 
• Selects BIS to integrate POS & TradeEdge 
• Selects POSmart to embed in DSR 
• “Best Software” Finalist 
BUSINESS INTELLIGENCE 
• Leverages data to provide users with “Fact Based Decision” capability. 
• Derived from an enterprise data warehouse for management decisions 
• Reports are also derived from “stove pipe” solutions, ERP applications and homemade 
integration processes. 
• Operational reports are not the same as Analytical reports. 
TRANSACTIONAL VS. ANALYTICAL REPORTING 
TRANSACTIONAL SYSTEM 
• DATABASE STRUCTURE DESIGNED FOR 
DATA ENTRY, UPDATE, AND PROCESSING. 
• OPERATIONAL REPORTS. 
• REPORTING USERS CAN IMPACT 
PROCESSING - QUICKLY BECOMES A SLOW 
ENVIRONMENT 
• PURCHASED APPLICATIONS CONTAIN 
STANDARD REPORTS 
• INCONSISTENT DUE TO “TWINKLING” 
• NO ACCESS TO SOME INFO 
• REPORTS CAN TAKE DAYS OR BE 
IMPOSSIBLE TO GET 
• NORMALIZED MODEL FOR FAST INPUT 
DATA WAREHOUSE 
• DATA MODEL DESIGNED FOR ANALYTICAL 
REPORTING AND AD-HOC QUERIES, BOTH 
FROM A CREATION AND A PERFORMANCE 
STANDPOINT 
• FREQUENTLY CONTAINS DETAIL DATA AND 
PRE-AGGREGATED SUMMARIES FOR FAST 
REPORTING 
• TOOLS ALLOW END USERS TO INQUIRE, 
DRILL FROM SUMMARY TO DETAIL 
• REPORTING USERS DO NOT IMPACT THE 
TRANSACTIONAL SYSTEM 
• OFTEN COMBINES DATA FROM MULTIPLE 
TRANSACTIONAL SYSTEMS 
• CONSISTENT – BUSINESS RULES 
• TYPICALLY DENORMALIZED 
Diagram labels: Transactional System (e.g. SAP, JDE, Oracle Apps, JDA, Homegrown); Periodic Data Feeds; Data Marts 
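The data warehouse bullets mention detail data, pre-aggregated summaries, and drilling from summary to detail. A minimal Python sketch of that idea follows. The rows, region names, and the `drill_down` helper are hypothetical and only illustrate the pattern; they are not taken from the presentation.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, store, units_sold).
detail = [
    ("East", "Store 1", 120),
    ("East", "Store 2", 80),
    ("West", "Store 3", 200),
]

# Pre-aggregated summary, built once at load time so that
# summary reports do not have to scan the detail rows again.
summary = defaultdict(int)
for region, _store, units in detail:
    summary[region] += units


def drill_down(region):
    """Return the detail rows that sit behind one summary figure."""
    return [(store, units) for r, store, units in detail if r == region]


print(dict(summary))
print(drill_down("East"))
```

In a real warehouse the summary would usually be a table refreshed by the periodic data feeds rather than an in-memory dictionary.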
BIG DATA STARTED WITH ERP AND DATA WAREHOUSING 
• DATA MART: FOCUSED COLLECTION OF SIMILAR DATA FOR REPORTING PURPOSES 
• DATA WAREHOUSE: INTEGRATION OF MULTIPLE DATA MARTS INTO AN ENTERPRISE SOLUTION 
Diagram labels: Sales Data Mart; Finance Data Mart; Forecasting Data Mart; International Sales Data Mart; Vendor Information Data Mart; Marketing Data Mart; Common Reference Values 
THE BIG DATA EXPLOSION! 
Accounting 
Shipments 
Order 
Processing 
Manufacturing 
Transactional/ERP 
Analytical 
Big Data 
Currency Conversion 
Weather Trends 
SMS/MMS 
Photos 
Syndicated Data 
Web & Outside Data Sources 
EDW 
CRM 
Loyalty 
Segmentation 
Panel Data 
Wholesaler, Distributor 
& Broker Data 
Promotion Results 
Web Logs 
EDI 
Retailer POS Web Logs 
3rd Party Data 
Click Stream 
Audio 
Textual Content 
Video 
Reputation 
Management 
Social Media 
Chatter 
Blogs 
Location Info 
3-D Content 
Schematics 
Geospatial 
Speech to 
Text 
Demographics 
Emerging Market 
WHAT’S THE DIFFERENCE? 
Un-Structured 
• Social Media 
• Chatter, Text 
Analytics, Blogs, 
Tweets, Comments, 
Likes, Followers, 
Social Authority, 
Clicks, Tags, etc. 
• Digital, Video 
• Audio 
• Geospatial 
Multi-Structured 
/Hybrid 
• Emerging Market Data 
• Loyalty 
• E-Commerce 
• Other Third Party Data 
• Weather 
• Currency Conversion 
• Demographic 
• Panel 
• POS, POL, IR, EDI, RFID, NFC, QR, 
IRI, Rsi, Nielsen, Other 
Syndicated, IMS, MSA, etc. 
Structured 
ERP & DW 
• Mainframe 
• SQL Server 
• Oracle 
• DB2 
• Sybase 
• Access, Excel, txt, etc. 
• Teradata 
• Netezza, Other MPP 
• SAP, JDE, JDA, Other ERP. 
VOLUME! 
IT’S NOT JUST SIZE, 
VARIETY! 
EDI 
RFID 
SAP 
DB2 
Oracle 
TXT 
SQL 
AS2 
CRM 
TPO JDE 
QR 
ACCESS 
Mobile 
EXCEL 
NPD 
IMS 
TPM 
E-Commerce 
CRM 
IT’S NOT JUST VOLUME & VARIETY! 
VELOCITY MATTERS! 
• Daily 
• Weekly 
• Monthly 
• Quarterly 
• Annually 
• Every Hour 
• Every Minute 
• Every Second 
• Every Nanosecond! 
• Constantly Changing 
• Constantly Growing! 
IT’S NOT JUST VOLUME & VARIETY & VELOCITY. 
COMPLEXITY! 
• Aligning Hierarchies 
• Integrating Internal Master Data with Retailer Master Data 
• Applying Various Calendars 
• Regional Territories 
• Geographic alignment 
• Currency Conversion 
• Emerging Market 
• Loyalty 
• Market Basket 
• Cleansing Issues 
• Re-cast Data 
• Slowly Changing Dimensions (how you want to handle 
history, new stores, etc). 
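The last bullet raises the question of how to handle history in a slowly changing dimension. One common answer, often called a "Type 2" approach, is to close the current row and add a new version, so that old sales still report against the old attribute values. The slide does not say which approach the author uses, so the sketch below is only an illustration under that assumption; `StoreRow`, `apply_change`, and `region_on` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class StoreRow:
    store_id: str
    region: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_change(history: List[StoreRow], store_id: str,
                 region: str, effective: date) -> None:
    """Type 2 style update: close the current row, then append a new version."""
    for row in history:
        if row.store_id == store_id and row.valid_to is None:
            if row.region == region:
                return  # nothing changed, keep the current row
            row.valid_to = effective
    history.append(StoreRow(store_id, region, effective))


def region_on(history: List[StoreRow], store_id: str, day: date) -> Optional[str]:
    """Look up the region that was in effect for a store on a given day."""
    for row in history:
        if (row.store_id == store_id and row.valid_from <= day
                and (row.valid_to is None or day < row.valid_to)):
            return row.region
    return None


history: List[StoreRow] = []
apply_change(history, "S001", "Midwest", date(2012, 1, 1))
apply_change(history, "S001", "Great Lakes", date(2013, 3, 1))  # territory realigned
apply_change(history, "S002", "Midwest", date(2013, 5, 1))      # new store

print(region_on(history, "S001", date(2012, 6, 15)))
print(region_on(history, "S001", date(2013, 6, 15)))
```

Overwriting the region in place would be simpler, but it would restate all past sales under the new territory, which is exactly the choice the bullet asks you to make deliberately.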
WHAT IS HADOOP? 
• HADOOP IS AN OPEN SOURCE SOFTWARE LIBRARY WITH 2 KEY COMPONENTS: 
1. DISTRIBUTED FILE SYSTEM (HDFS) – FOR HIGH BANDWIDTH, CLUSTER BASED STORAGE 
2. DATA PROCESSING FRAMEWORK – USES “MAPREDUCE” TO DISTRIBUTE/MAP LARGE DATA SETS ACROSS 
MULTIPLE SERVERS. EACH SERVER CREATES A SUMMARY OF THE DATA THAT HAS BEEN ALLOCATED TO IT. FROM 
THERE, DATA IS “REDUCED” OR “AGGREGATED.” SIMPLY PUT, IT IS MAPPED, THEN REDUCED. 
“HADOOP LETS YOU DEAL WITH VOLUME, VELOCITY AND VARIETY OF DATA. IT TRANSFORMS COMMODITY 
HARDWARE AND PROVIDES AUTOMATIC FAILOVER.” 
OWEN O’MALLEY, ARCHITECT FOR MAPREDUCE & SECURITY. 
WHAT IS MAPREDUCE? 
• A PARALLEL PROGRAMMING FRAMEWORK 
• MADE POPULAR BY GOOGLE 
• GENERATE SEARCH INDEXES 
• WEB SCORING ALGORITHMS 
• C++, JAVA, PYTHON, ETC. 
• HARNESS 1000S OF CPUS 
• MAPREDUCE PROVIDES 
• AUTOMATIC PARALLELIZATION 
• FAULT TOLERANCE 
• MONITORING & STATUS UPDATES 
“MAPREDUCE ALLOWS PROGRAMMERS 
WITHOUT ANY EXPERIENCE WITH PARALLEL 
AND DISTRIBUTED SYSTEMS TO EASILY 
UTILIZE THE RESOURCES OF A LARGE 
DISTRIBUTED SYSTEM.” 
- JEFFREY DEAN AND SANJAY GHEMAWAT, 
GOOGLE, INC., 2004 
Diagram labels: Map Function; Scheduler; Results; stages: map, shuffle, reduce 
MAPREDUCE MADE SIMPLE: WORD COUNT 
Unstructured Data Input: 
Boat Yacht Lake 
House House Lake 
Boat House Yacht 
Fish Fish Fish 
Splitting (one split per line): 
• Boat Yacht Lake 
• House House Lake 
• Boat House Yacht 
• Fish Fish Fish 
Mapping (one line per split): 
• Boat, 1; Yacht, 1; Lake, 1 
• House, 1; House, 1; Lake, 1 
• Boat, 1; House, 1; Yacht, 1 
• Fish, 1; Fish, 1; Fish, 1 
Shuffling (grouped by word): 
• Boat, 1; Boat, 1 
• Yacht, 1; Yacht, 1 
• Lake, 1; Lake, 1 
• House, 1; House, 1; House, 1 
• Fish, 1; Fish, 1; Fish, 1 
Reducing: 
• Boat, 2 
• Yacht, 2 
• Lake, 2 
• House, 3 
• Fish, 3 
Result: Boat, 2; Yacht, 2; Lake, 2; House, 3; Fish, 3 
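The word count stages on this slide can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce steps using the slide's input, not Hadoop code; the function names `map_phase`, `shuffle_phase`, and `reduce_phase` are chosen here for readability and are assumptions.

```python
from collections import defaultdict

# Input lines taken from the slide; each line stands in for one split.
lines = [
    "Boat Yacht Lake",
    "House House Lake",
    "Boat House Yacht",
    "Fish Fish Fish",
]


def map_phase(line):
    """Emit a (word, 1) pair for every word in one split."""
    return [(word, 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group the emitted values by word, as happens between map and reduce."""
    groups = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return groups


def reduce_phase(groups):
    """Sum the grouped values for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


mapped = [pair for line in lines for pair in map_phase(line)]
word_counts = reduce_phase(shuffle_phase(mapped))
print(word_counts)
```

On a cluster, each split would be mapped on a different server and the framework would handle the shuffle; here everything runs in one process so the data flow is easy to follow.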
COMMON TERMINOLOGY 
• PIG – HIGH LEVEL LANGUAGE THAT CONVERTS WORK TO MAPREDUCE 
• HIVE – CONVERTS SQL-LIKE QUERIES TO MAPREDUCE 
• HBASE – SCALABLE, DISTRIBUTED DATABASE. PROVIDES A SIMPLE INTERFACE TO 
DATA (E.G. FACEBOOK MESSAGES UTILIZE THIS) 
• ZOOKEEPER – PROVIDES COORDINATION FOR SERVERS 
• HCATALOG – METADATA PULLED OUT OF HIVE 
• MAHOUT – MACHINE LEARNING LIBRARY 
• SQOOP – TOOL THAT RUNS MAPREDUCE JOBS TO PULL DATA FROM OR PUSH DATA TO 
RELATIONAL DATABASES SUCH AS SQL SERVER OR ORACLE 
• CASCADING – TRANSLATES DOWN INTO MAPREDUCE 
• OOZIE – WORKFLOW COORDINATION TO RUN MAPREDUCE JOBS 
• FUSE DFS – USED TO ACCESS HDFS LIKE LINUX FILES 
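To give a feel for what "using SQL" means in the Hive entry above, the sketch below expresses a word count as a single GROUP BY query. It runs against an in-memory SQLite database from Python's standard library as a stand-in, so it is not Hive and produces no MapReduce jobs; the table name `words` and the sample text are assumptions made for the illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for a SQL-on-Hadoop engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT)")

text = "Boat Yacht Lake House House Lake Boat House Yacht Fish Fish Fish"
conn.executemany("INSERT INTO words VALUES (?)", [(w,) for w in text.split()])

# The grouping and counting that MapReduce does in separate phases
# is stated declaratively in one query.
rows = conn.execute(
    "SELECT word, COUNT(*) AS n FROM words GROUP BY word ORDER BY word"
).fetchall()
print(rows)
conn.close()
```

The point of tools in this family is that an analyst writes the query and the tool decides how to distribute the work.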
HOW CAN BIG DATA BE USED? 
• BIG DATA CAN BE USED TO MICRO-SEGMENT 
CUSTOMERS, ANALYZE SENTIMENT, PREDICT 
BEHAVIOR, PERSONALIZE OFFERS, CROSS-SELL 
AND UPSELL ACROSS CHANNELS, MANAGE 
REPUTATION, INCREASE SALES AND PROFITS. 
• COMPANIES NEED TO “WALK BEFORE THEY RUN.” 
• THE “BUILD IT & THEY WILL COME” PHILOSOPHY 
RARELY WORKS. IDENTIFY A BUSINESS NEED. 
SOCIAL MEDIA REQUIRES YOU TO 
LISTEN 
ENGAGE 
INFORM 
OFFER 
LEVERAGING THE DATA MEANS YOU NEED TO 
ACCESS 
ANALYZE 
ACT 
IS SOCIAL MEDIA REALLY WORTH 
LEVERAGING? 
ACCORDING TO THE PEW RESEARCH CENTER: 
• 100 MILLION ACTIVE USERS 
• 50 MILLION LOG ON TO TWITTER EVERY DAY 
• 55% ARE MOBILE USERS 
------------------------------------------- 
• AVERAGE TWEETS SENT PER DAY: 
• IN JANUARY 2010 – 50 MILLION TWEETS PER DAY 
• IN FEBRUARY 2011 – 140 MILLION TWEETS PER DAY 
• IN SEPTEMBER 2011 – 230 MILLION TWEETS PER DAY 
• There were 2.5 million tweets regarding Steve Jobs’ 
death in the first 13 hours after it was reported, which is 
about 53 tweets per second. 
• 6,939 Tweets per second in Japan on New Year’s Eve at 
Midnight 
According to McKinsey Global Institute: 
• Facebook – 700,000,000,000 minutes spent/month 
• Google – 34,000 search/sec 
• Email – 838,000,000 messages in 2013 
• Twitter – 500,000,000 tweets/day 
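The "about 53 tweets per second" figure for the Steve Jobs tweets follows directly from the two numbers given on the slide. A short check, using only those numbers:

```python
# Figures from the slide: 2.5 million tweets in the first 13 hours.
tweets = 2_500_000
hours = 13

seconds = hours * 60 * 60          # 13 hours expressed in seconds
rate = tweets / seconds            # average tweets per second over the period

print(seconds)
print(round(rate))
```

This is an average over the whole 13 hours, so the peak rate in the first minutes after the news would have been higher than the average suggests.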
IT’S ONLY JUST BEGUN! 
• LINKEDIN 
• FACEBOOK 
• YOUTUBE 
• SLIDESHARE 
• BRIGHTTALK.COM 
• SCRIBD 
• NAYMZ 
• JIGSAW 
• SPOKE 
• G+ 
• TWITTER 
• VINE 
• INSTAGRAM 
• BING 
UNDERSTAND YOUR INTERACTIONS 
KNOW YOUR SOCIAL REPUTATION 
KNOW WHERE YOUR SENTIMENT IS COMING FROM 
SEE WHERE YOUR CHAMPIONS ARE 
UNDERSTAND WHERE YOU NEED DAMAGE CONTROL 
WHAT ARE YOUR FOLLOWERS SAYING 
GOALS FOR TODAY – ACCOMPLISHED! 
• TO DEFINE BIG DATA – VOLUME, VARIETY, VELOCITY & COMPLEXITY 
• EXPLAIN HOW BIG DATA CAN IMPROVE BUSINESS – LISTEN, ENGAGE, INFORM & OFFER 
• EXPLAIN HOW TO USE IT – LEVERAGING A FOUNDATION 
• SHOW THE IMPORTANCE OF LEVERAGING SOCIAL MEDIA – INTEGRATE WITH OTHER DATA
THANK YOU & STAY TUNED! 
• FOLLOW JANET DORENKOTT ON LINKEDIN, EMAIL JANETD@RELATIONALSOLUTIONS.COM 
• CALL US AT 440-899-3296, JANET IS X225 / KAREN IS X232 
• FOLLOW RELATIONAL SOLUTIONS ON LINKEDIN, TWITTER @POSMARTBLUESKY & ON 
FACEBOOK 
• JOIN OUR “DEMAND SIGNAL REPOSITORY INSTITUTE” & “BIG DATA ASSOCIATION” GROUP ON 
LINKEDIN 
• SUBSCRIBE TO THE RELATIONAL SOLUTIONS CHANNEL ON YOUTUBE: 
• RELATIONAL SOLUTIONS CHANNEL 
• VISIT US AT WWW.RELATIONALSOLUTIONS.COM OR CALL 440-899-3296 X225 
• LEARN MORE FROM OUR WEBINARS & DOWNLOAD OUR WHITEPAPERS 
• SEE PRODUCT DEMOS & DOWNLOAD TRIALS FROM OUR WEBSITE 

Big data why big data is huge for CPG manufacturers
