Internet of Things
Aniekan Akpaffiong
College Speaker Series
January 2014
Introduction
• Agenda
– Internet of Things/Everything
– Big Data
– Cloud Computing
– Virtualization
What Is Internet Of Things
The general concept of the Internet of Things is
that we can put a sensor on anything and have
it send data back to a database through the
Internet.
In this way we can monitor everything,
everywhere and build smarter systems that are
more interactive than ever before.
Dan Rowinski, article on ReadWrite.com
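As a rough illustration of the idea quoted above, the sketch below packages a sensor reading as a JSON message and stores it in a database. It is a minimal sketch under stated assumptions: the sensor name, the field names, and the `readings` table are hypothetical, and an in-memory SQLite database stands in for a remote database reached over the Internet.

```python
import json
import sqlite3


def make_reading(sensor_id, temperature_c, timestamp):
    """Package one measurement as a JSON message, as a device might before sending it."""
    return json.dumps(
        {"sensor_id": sensor_id, "temperature_c": temperature_c, "timestamp": timestamp}
    )


def open_database():
    """Create an in-memory database with a hypothetical 'readings' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (sensor_id TEXT, temperature_c REAL, timestamp INTEGER)"
    )
    return conn


def store_reading(conn, message):
    """Parse a received JSON message and insert it into the readings table."""
    data = json.loads(message)
    conn.execute(
        "INSERT INTO readings (sensor_id, temperature_c, timestamp) VALUES (?, ?, ?)",
        (data["sensor_id"], data["temperature_c"], data["timestamp"]),
    )
    conn.commit()


def average_temperature(conn, sensor_id):
    """Monitoring query: average of all stored temperatures for one sensor."""
    row = conn.execute(
        "SELECT AVG(temperature_c) FROM readings WHERE sensor_id = ?", (sensor_id,)
    ).fetchone()
    return row[0]


conn = open_database()
for ts, temp in [(1, 20.0), (2, 21.0), (3, 22.0)]:
    store_reading(conn, make_reading("thermostat-1", temp, ts))
print(average_temperature(conn, "thermostat-1"))
```

In a real deployment the message would travel over a network protocol between the device and the database; that transport step is left out here to keep the sketch self-contained.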
Google Trends
[Figure: Google Trends search interest for Virtualization, Cloud Computing, Big Data, and Internet of Things. Graph generated 2014.]
Where Are The Things
How Many Things
• Projections Vary
– Use cases still largely theoretical
• Depends on what’s counted
– E.g., whether smartphones are included
• Beware of motivations
– Defining markets, setting agenda
Example Estimates
• 2020: 50 billion things (Cisco Systems, 2013)
• 2020: 26 billion things (Gartner)
Internet of Things
• “Internet of Things”
– Popularized in 1999 by an MIT research group (the Auto-ID Center)
– Sensor networks & content tagging to enable
interaction with physical and logical objects
– Core technologies
• Nanotechnology, intelligent
embedded systems, RFID,
sensor technology,…
– Platform technologies
• IPv6, Big Data, Virtualization,
Cloud Computing
– Standards & APIs
Image: entrance to SARAH, the artificially intelligent “home of the future” in Syfy’s Eureka.
IoT Components
• Enablers/Core
– Wireless
– ZigBee/NFC/RFID
– MEMS/Nanotech
– Content Analytics
• Platform
– IPv6
– Virtualization
– Cloud Computing
– Big Data
• Objects
Smart Lock, August.com
Internet of Things
• Smart Planet (IBM), Planetary Skin (Cisco),
CeNSE (HP),…
• Connections: it’s not just about people anymore
• IoT is about
– Data
– Sensors
– Control
– Analytics
– Networks
Internet of Things
• Applications of IoT
– Manufacturing
• Robotics, analytics, smart meters
– Retail
• Inventory tracking
– Healthcare
• Remote monitoring, records
– Transportation
• Autonomous vehicles, GPS
– Home
• Monitoring, SMART devices, locks
Thermostat, nest.com
Internet of Things
• Concerns
– Competence
– Technocracy
– Panopticon
– Profiling
– Hacking
– Complexity
– Inevitability
– Data Ownership
– Costs
– Energy
Image: disney.wikia.com
I’m not bad, I’m just drawn that way. – Jessica Rabbit
“If you have something that you don't
want anyone to know, maybe you
shouldn't be doing it in the first
place.” – Google CEO Eric Schmidt, December 2009 CNBC interview
Privacy & The Internet
Gartner Hype Cycle 2013
www.gartner.com/newsroom/id/2575515
Internet of Things
Virtualization
Before Virtualization
• Hardware (CPU, RAM, HDD)
– Host Operating System
• App, App
After Virtualization
• Hardware (CPU, RAM, HDD)
– Hypervisor
• Virtual Machine #1: Guest Operating System 1, Apps
• Virtual Machine #n: Guest Operating System n, Apps
Virtualization is a method of dividing a computer's hardware and software
resources into multiple execution environments, known as virtual machines.
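The definition above can be sketched as a toy model in which a host's CPU and RAM are carved into virtual machines. This is only an illustration under simplifying assumptions: the `Host` and `VirtualMachine` classes are hypothetical, and the model refuses to allocate more than the host physically has, whereas real hypervisors often allow overcommitment.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    name: str
    cpus: int
    ram_gb: int


@dataclass
class Host:
    cpus: int
    ram_gb: int
    vms: list = field(default_factory=list)

    def free_cpus(self):
        """CPUs not yet assigned to any virtual machine."""
        return self.cpus - sum(vm.cpus for vm in self.vms)

    def free_ram_gb(self):
        """RAM (in GB) not yet assigned to any virtual machine."""
        return self.ram_gb - sum(vm.ram_gb for vm in self.vms)

    def create_vm(self, name, cpus, ram_gb):
        """Carve a new VM out of the remaining resources, or fail if they do not fit."""
        if cpus > self.free_cpus() or ram_gb > self.free_ram_gb():
            raise ValueError(f"not enough free resources for {name}")
        vm = VirtualMachine(name, cpus, ram_gb)
        self.vms.append(vm)
        return vm


host = Host(cpus=16, ram_gb=64)
host.create_vm("vm-1", cpus=4, ram_gb=16)
host.create_vm("vm-n", cpus=8, ram_gb=32)
print(host.free_cpus(), host.free_ram_gb())
```

Server consolidation, the first benefit usually cited for virtualization, corresponds here to several `create_vm` calls sharing one `Host` instead of each workload owning its own hardware.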
Virtualization Benefits
Server Consolidation
Power Savings
Data Center Space Decrease
Ease of use
High Availability
Simplified Management
Improved Go-To-Market Time
Enhanced Security
Reduced Networking Costs
Reduced Carbon Footprint
Reduced TCO
Desktop Consolidation
Why People Love it
Remember: Business cares about results, not virtualization
Virtualization Challenges
Management
Problem Isolation
Charge-back and Ownership model
Licensing
Security and compliance
Training
Why People Hate it
Cloud Computing
neokobo.blogspot.com/2013/11/139-cloud-computing.html
Cloud Computing
Storage and compute happen here
Input and consumption of resulting information happens here
Cloud
Cloud Computing
• Computer scientist John McCarthy predicted in the
1960s that "Computation may someday be organized
as a public utility."
• What is cloud computing?
– By one of the more bare-bones definitions, it is the
ability to process information on someone else’s
device.
• Cloud computing essentially transfers
computing tasks to the Internet.
“Cloud computing is a model for enabling
ubiquitous, convenient, on-demand
network access to a shared pool of
configurable computing resources (e.g.,
networks, servers, storage, applications,
and services) that can be rapidly
provisioned and released with minimal
management effort or service provider
interaction” -- NIST
Reading the NIST definition phrase by phrase:
• “network access”: The service is provided over a network rather than a direct cable connection. Broadband or other high-speed network access is assumed.
• “model”: There are many use cases and many possible solutions for cloud computing, so any definition should be a model that can be modified to fit a particular use case.
• “configurable”: The resources have a management user interface that allows the user to configure and customize them as needed.
• “rapidly provisioned”: The asset or service can be requested (provisioned) more quickly than in the traditional IT model, usually in seconds or minutes.
• “minimal management effort”: The service provider should be able to “set it and forget it”. After the initial deployment, ongoing maintenance requirements should be minimal.
• “service provider interaction”: This implies a self-service model. The user has direct access to the resources and can provision (request) and release (return) an asset with minimum overhead.
• “shared pool”: The relevant compute resources are pooled and then logically divided and shared among the users. Related terms include multi-tenancy and elasticity.
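The ideas of a shared pool, self-service provisioning, and release can be sketched with a toy resource pool. This is a minimal sketch, not any provider's API: the `ResourcePool` class, the server identifiers, and the tenant names are all hypothetical.

```python
class ResourcePool:
    """A shared pool of identical servers leased to tenants on demand."""

    def __init__(self, server_ids):
        self._free = list(server_ids)  # servers nobody is using
        self._leases = {}              # server id -> tenant currently using it

    def provision(self, tenant):
        """Self-service request: hand the next free server to the tenant."""
        if not self._free:
            raise RuntimeError("pool exhausted")
        server = self._free.pop(0)
        self._leases[server] = tenant
        return server

    def release(self, server):
        """Return a server to the pool so another tenant can use it."""
        tenant = self._leases.pop(server)
        self._free.append(server)
        return tenant

    def available(self):
        """Number of servers currently free."""
        return len(self._free)

    def tenants(self):
        """Tenants currently sharing the pool (multi-tenancy)."""
        return sorted(set(self._leases.values()))


pool = ResourcePool(["s1", "s2", "s3"])
first = pool.provision("team-a")
second = pool.provision("team-b")
print(pool.available(), pool.tenants())
pool.release(first)
print(pool.available(), pool.tenants())
```

Elasticity in this picture is simply a tenant calling `provision` when demand rises and `release` when it falls, without anyone at the provider being involved.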
Cloud Computing – Deployment Models
Cloud Types and Properties
• Private Cloud
– For the exclusive use of a single organization
– Can have multiple consumers, e.g., business units
– May exist on or off premises
• Community Cloud
– For the exclusive use of a community with a shared concern or purpose
– Owned and managed by the community and/or a third party
– May exist on or off premises
• Public Cloud
– Available to the general public
– Owned and managed by a service provider organization
– Exists on the premises of the service provider
• Hybrid Cloud
– Composition of two or more distinct cloud infrastructures
– Bound together by standard or proprietary technology yet otherwise distinct
– Enables data and app portability
Big Data
www-03.ibm.com/ibm/history/exhibits/storage/storage_PH0350A.html
www.hoax-slayer.com/1956-hard-disk-drive.shtml
Two things we no longer see today:
1. Pan Am airlines
2. The 1956 IBM 305 RAMAC hard disk drive: 5 MB of storage,
weighing approximately one ton
Big Data
“There are more things in heaven and earth,
Horatio, than are dreamt of in your philosophy.”
– Hamlet
From the beginning of recorded time until 2003,
we created 5 billion gigabytes of data. In 2011
the same amount was created every two days.
By 2013 that time will shrink to 10 minutes.
– The Human Face of Big Data – Rick Smolan & Jennifer Erwitt
SI Decimal Prefix  Value   Hard Drive Storage (decimal)  Processor/Virtual Storage (binary)  Example
Binary Digit       10^0    1 bit                         1 bit                               Binary decision
Byte               8       8 bits                        8 bits                              Text character
Kilobyte           10^3    1000 bytes                    1024 bytes                          Half page
Megabyte           10^6    1000 kilobytes                1024 kilobytes                      One min. MP3 audio
Gigabyte           10^9    1000 megabytes                1024 megabytes                      894,784 plaintext pages
Terabyte           10^12   1000 gigabytes                1024 gigabytes                      4,581,298 books
Petabyte           10^15   1000 terabytes                1024 terabytes                      268,435,456 MP3 files
Exabyte            10^18   1000 petabytes                1024 petabytes                      245 million DVDs
Zettabyte          10^21   1000 exabytes                 1024 exabytes                       375 trillion digital pictures
Yottabyte          10^24   1000 zettabytes               1024 zettabytes
Brontobyte         10^27   1000 yottabytes               1024 yottabytes
Geopbyte           10^30   1000 brontobytes              1024 brontobytes
Note: brontobyte and geopbyte are informal names, not SI prefixes.
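The gap between the decimal and binary columns grows with each prefix, which a few lines of arithmetic make concrete. The sketch below is illustrative only; the helper names are hypothetical, and the page size (1,200 characters) and MP3 size (4 MiB) are assumptions chosen because they reproduce the 894,784 pages and 268,435,456 MP3 files quoted above, not values stated in the source.

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def decimal_bytes(prefix):
    """Bytes in one unit using powers of 1000 (hard drive convention)."""
    return 1000 ** (PREFIXES.index(prefix) + 1)


def binary_bytes(prefix):
    """Bytes in one unit using powers of 1024 (processor/memory convention)."""
    return 1024 ** (PREFIXES.index(prefix) + 1)


def shortfall_percent(prefix):
    """How much smaller the decimal unit is than the binary unit, in percent."""
    return 100 * (1 - decimal_bytes(prefix) / binary_bytes(prefix))


# Assumed sizes (not from the source): 1,200 characters per plaintext page,
# 4 MiB per MP3 file.
CHARS_PER_PAGE = 1200
MP3_BYTES = 4 * 1024 ** 2

print(binary_bytes("giga") // CHARS_PER_PAGE)  # plaintext pages per binary gigabyte
print(binary_bytes("peta") // MP3_BYTES)       # MP3 files per binary petabyte
for prefix in ("kilo", "giga", "tera"):
    print(prefix, round(shortfall_percent(prefix), 2))
```

This is why a drive sold as 1 terabyte (decimal) shows up as roughly 9% smaller when an operating system reports its size in binary units.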
What Is Big Data?
• Big data
– Too big in size, created too fast & with little or no
standard structure
– Structured + Semi-Structured + Unstructured Data
– Social media, sensors, retail, weather, etc.
– Difficult to process using traditional tools
• Big data spans three dimensions:
– Volume, Velocity and Variety.
The V’s of Big Data
Big Data Velocity Examples
• In one second on the internet there are…
– 197 Reddit votes cast
– 463 Instagram photos posted
– 833 Tumblr posts posted
– 1,024 Skype calls made
– 3,935 Tweets tweeted
– 11,574 Dropbox files uploaded
– 33,333 Google searches made
– 46,333 YouTube videos viewed
– 52,083 Facebook likes
Captured in 2014
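To get a feel for velocity, the per-second figures above can be scaled to a full day. This is a back-of-the-envelope sketch that assumes each rate stays constant around the clock, which is a simplification; the dictionary and function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Per-second rates taken from the list above (captured in 2014).
PER_SECOND = {
    "Reddit votes": 197,
    "Instagram photos": 463,
    "Tumblr posts": 833,
    "Skype calls": 1024,
    "Tweets": 3935,
    "Dropbox files": 11574,
    "Google searches": 33333,
    "YouTube views": 46333,
    "Facebook likes": 52083,
}


def per_day(rate_per_second):
    """Scale a per-second rate to a daily total, assuming a constant rate."""
    return rate_per_second * SECONDS_PER_DAY


for name, rate in PER_SECOND.items():
    print(f"{name}: {per_day(rate):,} per day")
```

Under that assumption, 3,935 tweets per second is about 340 million tweets per day, and 33,333 searches per second is close to 2.9 billion searches per day.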
Big Data
• Why Does Big Data Matter?
– New Data
– Unlock Value
– Shape The Future
– Knowledge Is Power
• Data Sources
– Interconnectivity
– Machines
– Historical
Image from commons.wikimedia.org
Analyzing Big Data
• New Tools
– Data Scientist
– Scale-out Hardware
– Parallel Programming Algorithms
• MapReduce
• Hadoop
– Fast Storage:
• In-memory
• SSD
• DAS
– Languages and tools for data mining/analytics: R, NoSQL
stores, Pig/Hive (on Hadoop), C/C++, Perl,…
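The MapReduce pattern named in the list above can be sketched with a word count, its usual introductory example. This is a single-process illustration in plain Python, not the Hadoop API: the function names are hypothetical, and a framework such as Hadoop would run the map and reduce phases in parallel across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values for each key, here by summing the counts."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Run map, shuffle, and reduce over a collection of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle(mapped))


print(word_count(["big data in the cloud", "Big Data and the Internet of Things"]))
```

Because each document is mapped independently and each key is reduced independently, both phases can be split across scale-out hardware, which is what makes the pattern suitable for data too large for one machine.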
Criticizing Big Data
• Privacy
• “Truthiness”
• “Forest for the Trees”
• Decisions based on the past
• Correlation does not mean causation.
• “Big Judgment”
– Providing context for big data outcomes
…not everything that can be
counted counts, and not
everything that counts can be
counted. – William Bruce Cameron
Big Data Analytics
“In a world where every click can be tracked and
recorded, we shouldn’t be managing customers
by putting them into groups of similar people,
we really shouldn’t be guessing. We should be
able to read the signals that customers are
telling us to figure out what they want. I call
that personalization.”
– Nilan Peiris, CMTO HolidayExtras.com
Big Data
Known Knowns
• Things we know and we
know that we know
• working data
Known Unknowns
• Things we know that we
do NOT know
• data to be acquired
Unknown Knowns
• Things we do NOT know
that we know
• forgotten data
Unknown Unknowns
• Things we do NOT know
that we do NOT know
• data ignorance
Popularized partly by Donald Rumsfeld
Big Data
Data → Information → Knowledge → Understanding → Wisdom
Roles: data engineer, data analyst, data miner, data scientist
Labels: raw data, context, experience, models, prediction
The data scientist can obtain actionable insight from an analysis of the data.
Smart Dust
The Sensors That Track Every Thing, Everywhere
Sensors are ubiquitous, not just in devices
Thank You

