Applying Big Data
Presented by John Dougherty, Viriton
4/25/2013
john.dougherty@viriton.com
Big Data Buzzwords
• Volume, Velocity, and Variety
• Agility/Agile Development
• Modeling Data
The 3 Vs originated in the early 2000s at META Group (now part of Gartner)
Volume is self-explanatory. Velocity = speed of transactions
Variety = Data profiling from multiple data sources
The Agile Manifesto, created February 2001 (Remember Scrum?)
Incorporation into Big Data software becoming mandatory
Adaptive and Predictive approaches are hotly contested
Data Modeling is paramount, given Big or Small datasets
Design must be confronted at ingress and egress
Hybrid data modeling and remodeling existing models
Veracity has been added as a fourth V, but has not yet been fully adopted
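As a rough illustration of "Variety = data profiling from multiple data sources", the sketch below merges records from two sources and reports, per field, how often it is populated and which value types show up. The source names, field names, and the `profile` helper are hypothetical and not from the talk.

```python
from collections import defaultdict

def profile(records):
    """Per field: how many records populate it and which value types appear."""
    stats = defaultdict(lambda: {"count": 0, "types": set()})
    for record in records:
        for field, value in record.items():
            if value is None:
                continue  # treat None as "not populated"
            stats[field]["count"] += 1
            stats[field]["types"].add(type(value).__name__)
    return dict(stats)

# Hypothetical records from two sources with overlapping but inconsistent schemas.
crm_rows = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": None},
]
web_events = [{"customer_id": "1", "page": "/pricing"}]

report = profile(crm_rows + web_events)
for field, info in sorted(report.items()):
    print(field, info["count"], sorted(info["types"]))
```

Here `customer_id` arrives as an integer from one source and a string from the other, which is the kind of mismatch a data model has to resolve at ingress.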
Big Data Buzzwords – Agile Dev.
• Informatics
• Daily Batch
• Classic Dept.
Informaticists are leveraged across multiple disciplines
There is no strict definition for a data scientist/informaticist
Greatest likelihood to adopt an agile/adaptive model
Development _should_ be incorporated into existing process
workflows. Seamlessness should be the goal.
Utilizing an agile approach to finding new uses for existing data
Least likely to need/adopt new development approaches
Relevant data must still be filtered through
Staff should not have to reinvent the wheel with each deployment
• Example of Hybrid Modeling
• Every project/objective must have properly
defined models to reach maximum efficacy
• Data silos are losing their complicit positioning
• Transitioning modeling to enumeration
Big Data Buzzwords – Data Models
Big Data Buzzwords – Question Inception
Connecting these lines is a great
example of the work that lies ahead in
identifying the objectives and goals of
the business environment
Big Picture
There is a lot of data
As of 2009, Google generates >2 EB per
year, >2TB indexed URLs, >9B page views per day
Facebook houses one billion users; utilizing >500TB
per day, housing 35% or more of the world’s photos
YouTube houses >1EB of data, >72 hours of video
per minute, >4B views per day
Twitter >125B tweets per year, >390M per
day, approximately 4500 per second
~2.3B people use the internet today; 90% of
the world’s data has been generated within the last
two years
The Internet of Things
(connected devices and data)
What will you be aggregating?
In 2002, recorded media and electronic information flows
generated about 22 exabytes (1 EB = 10^18 bytes) of information
In 2006, the amount of digital information
created, captured, and replicated was 161 EB
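The per-day, per-second, and per-year figures on this slide can be cross-checked with simple arithmetic. A minimal sketch using the Twitter figure quoted above (the variable names are illustrative only):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

tweets_per_day = 390_000_000  # ">390M per day" from the slide
tweets_per_second = tweets_per_day / SECONDS_PER_DAY
tweets_per_year = tweets_per_day * 365

print(f"{tweets_per_second:,.0f} tweets per second")
print(f"{tweets_per_year / 1e9:.0f}B tweets per year")
```

The daily figure works out to roughly 4,500 tweets per second and about 142B per year, which is consistent with the ">125B tweets per year" lower bound.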
Use Cases
IBM’s 5 High Value Use Cases
Big Data Exploration
Find, visualize, understand all big data to improve decision making. Big data exploration addresses the challenge
that every large organization faces: information is stored in many different systems and silos and people need
access to that data to do their day-to-day work and make important decisions.
Enhanced 360º View of the Customer
Extend existing customer views by incorporating additional internal and external information sources. Gain a full
understanding of customers—what makes them tick, why they buy, how they prefer to shop, why they switch, what
they’ll buy next, and what factors lead them to recommend a company to others.
Security/Intelligence Extension
Lower risk, detect fraud and monitor cyber security in real time. Augment and enhance cyber security and
intelligence analysis platforms with big data technologies to process and analyze new types (e.g. social
media, emails, sensors, Telco) and sources of under-leveraged data to significantly improve intelligence, security
and law enforcement insight
Operations Analysis
Analyze a variety of machine and operational data for improved business results. The abundance and growth of
machine data, which can include anything from IT machines to sensors, meters, and GPS devices, requires
complex analysis and correlation across different types of data sets. By using big data for operations
analysis, organizations can gain real-time visibility into operations, customer experience, transactions and behavior.
Data Warehouse Augmentation
Integrate big data and data warehouse capabilities to increase operational efficiency. Optimize your data
warehouse to enable new types of analysis. Use big data technologies to set up a staging area or landing zone for
your new data before determining what data should be moved to the data warehouse. Offload infrequently accessed
or aged data from warehouse and application databases using information integration software and tools.
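The "offload infrequently accessed or aged data" step can be sketched as a simple partition by last-access date. This is only an illustration under assumed names: the `last_accessed` field, the one-year cutoff, and the `split_by_age` helper are hypothetical, and real information integration tools apply far richer policies.

```python
from datetime import date, timedelta

def split_by_age(rows, today, max_age_days=365):
    """Partition rows into (keep_in_warehouse, offload_to_archive) by last access."""
    cutoff = today - timedelta(days=max_age_days)
    keep, offload = [], []
    for row in rows:
        target = keep if row["last_accessed"] >= cutoff else offload
        target.append(row)
    return keep, offload

# Hypothetical warehouse rows.
rows = [
    {"id": 1, "last_accessed": date(2013, 4, 1)},
    {"id": 2, "last_accessed": date(2010, 6, 15)},
]
keep, offload = split_by_age(rows, today=date(2013, 4, 25))
print("keep:", [r["id"] for r in keep])
print("offload:", [r["id"] for r in offload])
```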
• Applied since data science began in the 1970s
• Many different products available, augmenting existing
solutions, and providing all-in-one
• SAS
• SAP (though called predictive analytics, still fits)
• The same problems occur with extensibility as with
design/deployment
Use Cases
Visual Analytics
Science: Visual analytics is the
science of analytical reasoning
facilitated by interactive visual
interfaces
Sensor Analytics
Internet of Things: The first mention of the gargantuan brontobyte
(1 Bit = Binary Digit · 8 Bits = 1 Byte · 1024 Bytes = 1 Kilobyte · 1024 Kilobytes = 1 Megabyte ·
1024 Megabytes = 1 Gigabyte · 1024 Gigabytes = 1 Terabyte · 1024 Terabytes = 1 Petabyte ·
1024 Petabytes = 1 Exabyte · 1024 Exabytes = 1 Zettabyte · 1024 Zettabytes = 1 Yottabyte ·
1024 Yottabytes = 1 Brontobyte · 1024 Brontobytes = 1 Geopbyte)
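The unit ladder above can be written as a small conversion helper. This sketch follows the slide's 1024-based steps. Note that "brontobyte" and "geopbyte" are informal names rather than standardized prefixes, and that the `UNITS` list and `to_bytes` function are illustrative only.

```python
# Units in ascending order; each step is a factor of 1024, as on the slide.
UNITS = [
    "byte", "kilobyte", "megabyte", "gigabyte", "terabyte", "petabyte",
    "exabyte", "zettabyte", "yottabyte", "brontobyte", "geopbyte",
]

def to_bytes(value, unit):
    """Convert a quantity in the given unit to bytes (binary multiples)."""
    return value * 1024 ** UNITS.index(unit)

print(to_bytes(1, "exabyte"))     # 1024**6 bytes
print(to_bytes(1, "brontobyte"))  # 1024**9 bytes
```

Under this binary convention one exabyte is 2^60 bytes, slightly more than the decimal 10^18 bytes used elsewhere in the deck.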
• ROI metrics are difficult to predict, but follow a
trend of double- and triple-digit percentages
• What keeps the CEO up at night, decision
journeys
• An anecdotal report (questionnaire) shows
44% of CMOs can measure their ROI
• Design and development will continue to be
paramount to a successful return
ROI?
ROI
Nucleus Research
• Becoming an analytic enterprise requires Big Data
• Average ROI of 241%
• Increased productivity
• A major metropolitan police department achieved an 863 percent ROI when it combined its criminal
records database with a national crime database created by a major university.
• Reduced labor costs
• A major resort earned an ROI of 1,822 percent when it integrated shift scheduling processes with data from a
national weather service, enabling managers to avoid unnecessary shift assignments and increase staff
utilization.
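To make the percentages concrete, the sketch below assumes the common definition ROI = (gain - cost) / cost. The Nucleus Research report may use a different methodology, and the dollar amounts are invented for illustration.

```python
def roi_percent(gain, cost):
    """Return on investment as a percentage: net gain relative to cost."""
    return (gain - cost) * 100 / cost

def implied_gain(roi_pct, cost):
    """Total gain implied by an ROI percentage and a cost."""
    return cost * (100 + roi_pct) / 100

cost = 100_000  # hypothetical project cost in dollars
for label, pct in [("average", 241), ("police dept.", 863), ("resort", 1822)]:
    print(f"{label}: {pct}% ROI on ${cost:,} implies ${implied_gain(pct, cost):,.0f} in total gain")
```

Under this definition, a 241% ROI means each dollar invested returns the dollar plus $2.41 in net gain.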
Four Stages of an
Analytic Enterprise
Telco reduces costs associated with CO management
and circuit deployment by 230%
QoS data expected to expand well into Petabytes for
the Telco industry
Moving Forward
How to formulate the right questions
• Communication between C-Suite and VP isn’t enough
• Considering old data, holistic approaches work best
• Objectives and goals begin with dialogue at the highest
levels
• What are the questions we should be asking?
Should we start now? Yes.
Brought to you by:
BIBLIOGRAPHY
• http://blogs.starcio.com/2012/12/what-is-big-data-real-challenges-beyond.html - Big Data for All Businesses
• http://www.nytimes.com/2013/03/24/nyregion/mayor-bloombergs-geek-squad.html?pagewanted=all&_r=2& - NYC Mayor’s use case
• http://www.312analytics.com/what-is-machine-learning-big-data-modeling/ - Data Modeling, the big challenges
• http://goo.gl/wH3qG - Origin of VVV
• http://www.businessinsider.com/cia-presentation-on-big-data-2013-3?op=1 - CIA CTO Presentation
• http://www.rosebt.com/1/post/2013/03/data-science-and-analytics-workflow.html - Rose Business Technologies, Workflow
• http://nucleusresearch.com/research/notes-and-reports/the-big-returns-from-big-data/ - Nucleus Research



Editor's Notes

  • #2: Thank you for coming to Big Data for Business. No other speakers, see if there is interest
  • #3: Concentrate on Veracity… not too much. Development is not rigid, and agility may not be the only option. We have evidence that other approaches may yield just as good or better results. Data modeling is paramount to a proper deployment
  • #4: Discuss why these are important to recognize for deployment and design. These styles all have similar issues with finding the right value
  • #5: Real time data flow is now the next step in finding answers. We still have to develop the right questions, or the right methods for finding the right questions
  • #6: Thank Michael Walker for the graphic. This illustrates a great abstraction for discourses at the business necessity perspective
  • #7: That’s a lot of data! End with the possibility of data aggregation sources, novel and established
  • #8: IBM has a pretty good grasp of Big Data’s implementations
  • #9: There are, fortunately or unfortunately, far fewer use cases than there are companies to provide solutions for those use cases
  • #10: Decision journeys for customers and predicting usage/purchasing patterns. Utilized heavily in the Amazon space (both by Amazon and by their market source partners)
  • #11: Return on investment is proven, but not guaranteed for your business. Finding one massive return might justify the costs, but guaranteeing small returns will win more arguments
  • #12: There are a slew of resources available, and I will post this presentation along with other materials on the meetup page. Thanks again for coming, let’s have a discussion, and make sure to fill up
  • #13: This will be available online in a few days