Machine Learning for Hackers
is how we make sense of big data.
Adam Gibson
2-27-2014 SFHN
BIG DATA & STATISTICS
• Statistics – Group by, aggregate, count, average, mean, p-values, mode, correlations, exploring, < 100 variables
• Machine Learning – Label this image, predict the next event, pick out the anomalies – aka learn from data rather than just count it, group data by similarities, > 100 variables.
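The contrast above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy data, the labels, and the helper names `fit_centroids` and `predict` are hypothetical and not from the talk, and the "model" is a deliberately tiny nearest-centroid classifier.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy data (an assumption): (hours_active_per_week, label)
events = [(1.0, "churned"), (1.5, "churned"), (2.0, "churned"),
          (8.0, "stayed"), (9.0, "stayed"), (10.0, "stayed")]

# Statistics: count and aggregate what is already in the data.
counts = Counter(label for _, label in events)
average_hours = mean(hours for hours, _ in events)


# Machine learning: fit a simple model, then label a value it has not seen.
def fit_centroids(rows):
    """Return the mean feature value per label (a nearest-centroid model)."""
    grouped = {}
    for value, label in rows:
        grouped.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in grouped.items()}


def predict(centroids, value):
    """Label a new value with the label of the closest centroid."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


centroids = fit_centroids(events)
print(counts, average_hours)
print(predict(centroids, 3.0))
```

The first half only summarizes rows that exist; the second half generalizes to an input that was never observed, which is the "learn from data" part.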
What is data?!
Many kinds of data. Wow.
• Unstructured – Text, Video, Images, Time Series (Data Scientists)
• Structured – SQL, XML, JSON, CSV (We know this, and just process it.)
WHAT do machines learn?
• Machine learning is a general tool that can work with various data types.
• Images = Machine vision
• Text = Natural-language processing
• Time-series = Prediction
• Facial recognition => Security
• Text => Customer profiles/Recommendation engines
• Time-series => Stock-market trading platforms
• NLP => Customer service
WHAT IS A DATA SCIENTIST?
Analyst
Exploratory analysis of data,
typically on smaller data sets.
Understands the algorithms
and interprets data.

Distributed Systems Engineer
Implements production data crunching,
also known as the NoSQL person.
They handle distributed systems
and workloads, APIs, perhaps
even data collection and storage.
What kinds of Machine Learning are there?
Unsupervised – Clustering (group things that are similar)
Supervised – Label all the things! Predict the future!
Includes regression (correlation != causation, ring a bell?)
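A minimal sketch of the two families, assuming one-dimensional toy data. The function names `kmeans_1d` and `nearest_neighbor`, the starting centers, and the labels are all hypothetical choices made for illustration; the talk does not name specific algorithms.

```python
def kmeans_1d(values, centers, iterations=10):
    """Unsupervised: group unlabeled values around the nearest center."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


def nearest_neighbor(labeled, value):
    """Supervised: copy the label of the closest labeled example."""
    return min(labeled, key=lambda pair: abs(pair[0] - value))[1]


# No labels here: the algorithm finds the two groups on its own.
values = [1.0, 1.5, 0.5, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print(centers, clusters)

# Labels are given up front: the algorithm predicts a label for new input.
labeled = [(1.0, "small"), (9.0, "large")]
print(nearest_neighbor(labeled, 8.0))
```

The difference to notice is the input: clustering receives only values, while the supervised function needs examples that someone has already labeled.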
How does this affect me?
I will leave who does this to your imagination.
• Ad targeting
• Recommends you movies
• Brings you search results
• Recognizes your face in the camera
• Drives your car
• Automatically disables your credit card when you leave the country
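The credit card case is a form of anomaly detection. As a rough sketch only, here is a z-score check over past purchase amounts; the function name `is_anomalous`, the threshold of 3 standard deviations, and the sample amounts are assumptions for illustration, and a production fraud system would use many more signals (location, merchant, timing) than a single number.

```python
from statistics import mean, stdev


def is_anomalous(history, amount, threshold=3.0):
    """Flag an amount more than `threshold` standard deviations from the mean."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts are identical, so any different amount stands out.
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Hypothetical purchase history for one card.
past_purchases = [20.0, 25.0, 22.0, 18.0, 24.0, 21.0]
print(is_anomalous(past_purchases, 23.0))
print(is_anomalous(past_purchases, 900.0))
```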
Can I do this?
The shortcut here is to start with the basics –
for example Google Analytics, or understanding churn rate.
Pick up a more advanced understanding
after that if it still seems interesting.
If you are into backends, start with distributed systems, and
get your math basics up enough to understand
what the person on the other side of the table,
who's asking you to put the algorithm into production, is saying.
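Churn rate is a good first metric because it is plain arithmetic. A minimal sketch, assuming the common definition (customers lost during a period divided by customers at the start of that period); the function name and the numbers are hypothetical.

```python
def churn_rate(customers_at_start, customers_lost):
    """Fraction of the starting customers who left during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


# Hypothetical month: 200 customers at the start, 10 of them cancelled.
monthly_churn = churn_rate(200, 10)
print(f"{monthly_churn:.1%}")
```

Other definitions exist (for example, dividing by the average customer count over the period), so it is worth checking which one a given dashboard uses.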
Resources
Coursera Machine Learning
Reddit Machine Learning
DataTau (hacker news for data scientists)
More mathy Stanford Machine Learning
Tools
Analysts
• http://scikit-learn.org/stable/
• Julia Lang
• R Lang
Data Engineers
• Spark
• Hadoop
• Storm
• Hadoop QuickStart VM
San Francisco Hacker News - Machine Learning for Hackers
