Big Data &
Machine Learning
Angelo MARIANO
@angelinux74
angelo.mariano@gmail.com
Summary
What I am trying to present...
● Everything is data
● Automation is everywhere
● Learning how to learn
Big data is like teenage sex:
everyone talks about it,
nobody really knows how to do it,
everyone thinks everyone else is doing it,
so everyone claims they are doing it...
Dan Ariely, Duke University
What is Big Data?
In a few words, we talk of big data when
we let a machine (or a bunch of machines)
analyse a huge amount of data
Data explosion
Data Sources
Who collects them?
● GPS
● Internet
● Smartphones
● Wearable devices
● Sensors
In 2020, there will be 150 billion recording devices, 20 times the Earth's population
“Today, we are so dependent on oil, and oil is so
embedded in our daily doings, that we hardly
stop to comprehend its pervasive significance. It
is oil that makes possible where we live, how we
live, how we commute to work, how we travel –
even where we conduct our courtships. It is the
lifeblood of suburban communities.”
Daniel Yergin (1991) The Prize
Big Data Axes
2013
2016
MapReduce & HDFS
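As a rough illustration of the MapReduce idea named on this slide, the sketch below counts words in a single process. It only mimics the three stages (map, shuffle, reduce) and does not use Hadoop or HDFS; the function names and the sample documents are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    # Emit a (word, 1) pair for every word in one input document.
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    # Group emitted values by key, as a framework would do between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Combine all values for one key into a single result.
    return key, sum(values)


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input: two tiny "documents".
counts = word_count(["big data", "Big machines analyse data"])
```

In a real cluster the map and reduce calls would run on different machines, close to the data blocks they read; here everything runs locally to keep the structure visible.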
Automation
How to process big data?
● Everything will become intelligent; soon we will
not only have smart phones, but also smart
homes, smart factories and smart cities.
● There is an automation of data analysis. Artificial
intelligence is no longer programmed line by line,
but is now capable of learning, thereby
continuously developing itself.
● Algorithms can now recognize handwritten
language and patterns almost as well as humans
and even complete some tasks better than them.
● Today 70% of all financial transactions are
performed by algorithms
● Today, it is easier and cheaper to generate data
than ever before, and the tools to turn these
data into insights are growing exponentially in
both quality and quantity. So much so that any
organization dealing with data that does not
apply machine learning in some fashion will be
left behind.
AI is the new electricity. Just as 100 years
ago electricity transformed industry after
industry, AI will now do the same.
Andrew Ng, Oct. 2016, Fortune Magazine
Machine learning
● Machine learning lets computers learn a task without being explicitly programmed for it,
thanks to the huge amount of data provided to them as study material
● Just as humans learn from experience, these algorithms learn from data. A child learns to
recognize a cat after seeing five or six specimens; a computer needs thousands and
thousands of examples, but it is amazing how much it can do once it is able to
profit from its learning
● Once trained, the machines can make predictions. This is the most important skill of the AI
revolution: algorithms predict which books we would like to read on Amazon and what
we want to see in our Facebook feed, and, alongside a doctor, they may even detect a disease
that a patient is developing while it is still in its early stages
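The train-then-predict loop described above can be sketched with a 1-nearest-neighbour classifier in plain Python. This is only one of many possible learning methods, chosen for brevity; the features, labels, and function names are hypothetical toy values.

```python
import math


def train(examples):
    # For nearest neighbour, "training" simply stores the labelled examples.
    return list(examples)


def predict(model, point):
    # Return the label of the stored example closest to the new point.
    nearest = min(model, key=lambda example: math.dist(example[0], point))
    return nearest[1]


# Hypothetical toy data: (weight in kg, ear length in cm) -> label
examples = [
    ((4.0, 6.5), "cat"),
    ((5.0, 7.0), "cat"),
    ((30.0, 12.0), "dog"),
    ((25.0, 11.0), "dog"),
]
model = train(examples)
label = predict(model, (4.5, 6.8))
```

With only four examples the model is fragile; as the slide notes, real systems need thousands of examples before their predictions become reliable.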
An example: Feed Forward Neural Net
Frank Rosenblatt (1957), The Perceptron - a perceiving and recognizing automaton
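A minimal sketch of a single perceptron unit, the simplest feed-forward network, trained with the classic error-correction rule. The logical AND dataset, the learning rate, and the epoch count are assumptions chosen for illustration and are not taken from the slides.

```python
def step(z):
    # Threshold activation: fire (1) when the weighted sum is non-negative.
    return 1 if z >= 0 else 0


def classify(weights, bias, x):
    # Forward pass: weighted sum of the inputs plus bias, then threshold.
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_perceptron(samples, epochs=20, lr=1.0):
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - classify(weights, bias, x)
            # Perceptron rule: nudge weights toward the correct answer.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Hypothetical training set: the logical AND of two binary inputs.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND)
```

A single unit like this can only separate classes with a straight line (or hyperplane); stacking units into layers removes that limit.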
Universal Approximation Theorem
George Cybenko (1989), Kurt Hornik (1991)
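For reference, one common way to state the theorem (paraphrased here in the form given by Cybenko, with notation chosen for this sketch): let $\sigma$ be a continuous sigmoidal function and $I_n = [0,1]^n$. Then single-hidden-layer networks can approximate any continuous function on $I_n$ as closely as desired.

```latex
\forall f \in C(I_n),\ \forall \varepsilon > 0,\ 
\exists N \in \mathbb{N},\ \alpha_j, b_j \in \mathbb{R},\ w_j \in \mathbb{R}^n
\quad \text{such that} \quad
G(x) = \sum_{j=1}^{N} \alpha_j \, \sigma\!\left(w_j^{\top} x + b_j\right)
\quad \text{satisfies} \quad
\sup_{x \in I_n} \left| G(x) - f(x) \right| < \varepsilon .
```

The theorem guarantees that suitable weights exist; it says nothing about how many hidden units $N$ are needed or how to find the weights by training.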
Something new happens...
Deep Neural Network
Deep Visualization Toolbox
AlphaGo
Deep Learning for a Japanese farmer
So, follow the data. Choose a representation that can use
unsupervised learning on unlabeled data, which is so much
more plentiful than labeled data. Represent all the data
with a nonparametric model rather than trying to summarize
it with a parametric model, because with very large data
sources, the data holds a lot of detail.
Halevy, Norvig, Pereira (2009), The Unreasonable Effectiveness of Data
@angelinux74
angelo.mariano@gmail.com
What we
understand
is over the
surface...
References
● Dan Ariely on big data
● DevOps Borat on big data
● Data explosion
● Daniel Yergin, The Prize
● Data is the new oil
● IBM infographic on big data
● Data never sleeps
● Data never sleeps 4.0
● MapReduce Google paper
● MapReduce Tutorial
● Google Filesystem
● HDFS Architecture
● Big Data & Analytics
● Big Data ecosystem
● Automation and democracy
● Andrew Ng, Fortune
● Frank Rosenblatt, the Perceptron
● Introduction to ANN
● Universal approximation theorem and its illustration
● Deep neural networks
● Deep learning tutorial
● Multi GPU deep learning from Nvidia
● Deep Visualization Toolbox
● How Google's AlphaGo beat a Go world champion
● A Japanese farmer and deep learning
● Eugene Wigner, The Unreasonable Effectiveness of Mathematics in the Natural Sciences
● The Unreasonable Effectiveness of Data
● Jeremy Howard, The wonderful and terrifying implications of computers that can learn
