AWS & GOOGLE MACHINE LEARNING SERVICES
11.10.2017
Max Pagels, Machine Learning Specialist
@maxpagels, linkedin.com/in/maxpagels/
ABOUT ME
• BSc, MSc in Computer Science (University of Helsinki)
• Currently doing applied machine learning at SC5, a
consultancy based in Helsinki
• Former CS researcher
• Current favourite ML algorithm: LSTM neural networks
• Favourite programming languages: JavaScript for full-stack,
Python for ML
• Other hats I wear: full-stack developer, technical interviewer
AGENDA
Part 1: Overview of AWS & Google machine learning services
〜 〜 〜
Part 2: Classification demo using Google ML Engine and AWS ML
〜 〜 〜
Part 3 (time permitting): Free-form Q&A
IF YOU HAVE ANY QUESTIONS, AT ANY TIME, DO ASK!
MACHINE LEARNING
MACHINE LEARNING LEARNS FROM
DATA TO SOLVE COMPLEX TASKS
In contrast to traditional programming, machine learning
learns from data and automatically produces a program to
solve a task.
That makes it suitable for tasks that are impossible or
infeasible to code by hand, as well as automation of things
that are tedious but (until now) required human
supervision. ML is the foundation of modern artificial
intelligence.
TRADITIONAL PROGRAMMING
Input → Algorithm → Output
MACHINE LEARNING
Input & output → Learning algorithm → Program
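A minimal sketch of the contrast, assuming a toy one-feature task. The function names, the 2.5 threshold and the data points are invented for illustration and are not from the talk.

```python
from typing import Callable, List, Tuple


# Traditional programming: a human writes the rule by hand.
# The threshold 2.5 is a made-up value.
def handwritten_rule(petal_length_cm: float) -> str:
    return "short" if petal_length_cm < 2.5 else "long"


# Machine learning: the learning algorithm receives inputs AND outputs
# and returns a program (here, a one-threshold classifier).
def learn_threshold(examples: List[Tuple[float, str]]) -> Callable[[float], str]:
    shorts = [x for x, label in examples if label == "short"]
    longs = [x for x, label in examples if label == "long"]
    # Place the decision boundary halfway between the two groups.
    threshold = (max(shorts) + min(longs)) / 2.0

    def learned_program(x: float) -> str:
        return "short" if x < threshold else "long"

    return learned_program


# Toy labelled data, invented for illustration only.
training_data = [(1.0, "short"), (1.5, "short"), (4.0, "long"), (5.0, "long")]
classify = learn_threshold(training_data)
print(classify(1.2), classify(4.5))  # short long
```

The learned program has the same shape as the handwritten one, but its threshold came from the data rather than from the programmer.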
THERE ARE 3 MAIN TYPES OF MACHINE
LEARNING
SUPERVISED LEARNING
Learn from labelled data to build
a model for predicting a number
(regression) or a discrete class
(classification) on new data
UNSUPERVISED LEARNING
Find structure in unlabelled data
to provide insights
REINFORCEMENT LEARNING
Take actions in the world, receive
rewards, and learn to maximise
reward over time
MACHINE LEARNING IS RESOURCE-
INTENSIVE
ML models learn from data using numerical
optimisation and linear algebra. It’s not uncommon
for each pass over training data to require
thousands or even millions of mathematical
operations. Inference (predicting/classifying new
examples) is also expensive.
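A rough back-of-the-envelope sketch of where those numbers come from, assuming a simple linear model whose cost is dominated by one multiply-add per example, feature and class. The helper name and the larger dataset are hypothetical; the small case uses the Iris dimensions from the demo later in this deck.

```python
def multiply_adds_per_pass(n_examples: int, n_features: int, n_classes: int) -> int:
    """Multiply-add count for one forward pass of a linear model over the data."""
    return n_examples * n_features * n_classes


# Iris-sized problem: 150 examples, 4 features, 3 classes.
small = multiply_adds_per_pass(150, 4, 3)
# A hypothetical larger problem: 100,000 examples, 10 features, 1 output.
large = multiply_adds_per_pass(100_000, 10, 1)

print(small)  # 1800
print(large)  # 1000000
```

Training repeats such passes many times and also computes gradients, so real costs are a multiple of these counts.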
THE CLOUD OFFERS CPU/GPU COMPUTE
ON-DEMAND
In addition to infrastructure-as-a-service, the big
cloud vendors (Azure, Google, AWS) also offer a
number of managed and hybrid ML & AI solutions.
LET’S TAKE A CLOSER LOOK AT THE
SERVICES PROVIDED BY GOOGLE AND
AMAZON (AWS)
GOOGLE CLOUD MACHINE LEARNING
From their website: “Google Cloud's AI provides modern machine learning
services, with pre-trained models and a service to generate your own tailored
models. Our neural net-based ML service has better training performance and
increased accuracy compared to other large scale deep learning systems. Our
services are fast, scalable and easy to use. Major Google applications use Cloud
machine learning, including Photos (image search), the Google app (voice search),
Translate, and Inbox (Smart Reply). Our platform is now available as a cloud
service to bring unmatched scale and speed to your business applications.”
BREAKDOWN OF GOOGLE AI SERVICES
FULLY MANAGED APIS
• Cloud Jobs: machine learning-powered
job search engine
• Cloud Video Intelligence: extract
metadata, identify key nouns, and
automatically annotate the content of
videos using a REST API
• Cloud Vision: image classification, object
recognition and OCR as-a-service
• Cloud Speech: audio-to-text
• Natural Language: extract information
about people, places, events etc.
Sentiment analysis supported
• Cloud Translation: language translation à
la Google Translate
HYBRID/BYO
• Machine Learning Engine: general-
purpose machine learning training and
inference engine
• Implement your own learning
algorithms in TensorFlow, provide
training data, train in the cloud without
worrying about servers
• Deploy trained models in the cloud for
a scalable serverless prediction API
• Can also do training and/or inference
on your local machine
AWS MACHINE LEARNING SERVICES
From their website: “Within AWS, we’re focused on bringing that knowledge and
capability to you through three layers of the AI stack: Frameworks and
Infrastructure with tools like Apache MXNet and TensorFlow, API-driven
Services to quickly add intelligence to applications, and Machine Learning
Platforms for data scientists.”
BREAKDOWN OF AWS AI SERVICES
FULLY MANAGED APIS
• Amazon Lex: natural language
understanding and speech recognition,
powered by the same AIs used in Alexa
• Amazon Polly: text-to-speech as-a-
service
• Amazon Rekognition: ready-made image
recognition, object recognition, and OCR
FULLY MANAGED SERVICES
• Amazon Machine Learning: linear and
logistic regression as-a-service
• Provide data, choose learning
algorithm, train in the cloud
• Deploy a trained model as a
prediction API in the cloud
BYO/PLATFORM SERVICES
• Amazon EMR: Managed Hadoop/Spark
environment, implement your
algorithms/training/inference yourself
• Amazon Deep Learning AMIs: spin up
instances on EC2 preinstalled with
TensorFlow, MXNet, Theano, Caffe, CNTK,
Torch etc. and handle the rest yourself
• Amazon EC2: spin up instances with the
CPU/GPU power you require and install
whatever you like
A NOTE ON PRICING
Cloud services are typically pay-as-you-go/pay for what you use. For
machine learning, that usually means you pay for time/resources needed to
train, and time needed to do predictions. Unless you use a service that
explicitly spins up hardware and keeps it running, you typically don’t pay
anything if you aren’t doing training/inference.
PRICING EXAMPLE (CLOUD INFRASTRUCTURE)
Example: on AWS EC2, a p2.8xlarge instance has:
• 32 vCPUs
• 488 GiB RAM
• 8 NVIDIA K80 GPUs, each with 2,496 parallel
processing cores and 12 GiB of GPU memory
PRICING EXAMPLE (CLOUD INFRASTRUCTURE)
Cost of buying one K80 yourself: 5,000 €
Cost of buying the equivalent hardware yourself: 50,000 €
Cost of running the instance in AWS: about 8 € per hour
50,000 € equals 260 consecutive days of p2.8xlarge use
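The break-even figure can be checked with a few lines of arithmetic. This is a sketch using only the rounded prices quoted above; it ignores electricity, maintenance and hardware depreciation, which is a simplifying assumption.

```python
HARDWARE_COST_EUR = 50_000          # equivalent hardware bought outright
INSTANCE_COST_EUR_PER_HOUR = 8      # approximate p2.8xlarge hourly price

break_even_hours = HARDWARE_COST_EUR / INSTANCE_COST_EUR_PER_HOUR
break_even_days = break_even_hours / 24

print(f"{break_even_hours:.0f} hours = {break_even_days:.0f} days")  # 6250 hours = 260 days
```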
QUESTIONS SO FAR?
MULTI-CLASS CLASSIFICATION USING GOOGLE ML
ENGINE & AWS MACHINE LEARNING
THE IRIS DATASET
There are lots of freely available ML datasets online (for
example on Kaggle). One of them is the legendary Iris flower
dataset.
It includes three iris species with 50 samples each as well as
some properties (features) about each flower: sepal length,
sepal width, petal length & petal width (in cm).
With Google ML Engine and AWS Machine Learning, we are
going to train an ML model on the iris data to build a
classifier that can correctly classify new examples as one of
three iris species (classes).
LEARNING ALGORITHM: LOGISTIC
REGRESSION
Logistic regression is a simple learning algorithm. It’s similar
to linear regression, but meant for classification problems.
LOGISTIC REGRESSION OVERVIEW
1. Assume a linear relationship between features:
L = w₁ * sepal_width + w₂ * sepal_length + w₃ * petal_width + w₄ * petal_length
2. Use a sigmoid function to convert the result to a probability of
belonging to a class:
H(L) = sigmoid(L) = 1 / (1 + e^(-L))
3. Build 1) & 2) for each possible class
4. Iterate over our dataset, construct H(L) for each example, check how
far we were from the correct class (we have the correct answers in our
labelled dataset)
5. Adjust weights w₁, w₂, w₃ and w₄ so that, on average, we get things
less wrong next time (note: use partial derivatives)
6. Iterate 4)-5) until we achieve good accuracy (i.e. classify as many
examples correctly as possible)
7. Stop iterating when accuracy is “good enough” or after some
predetermined number of iterations
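The steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the code used in the demos: it assumes log loss (so the partial derivative for each weight is (H(L) - target) * feature), adds a bias weight that the formula above omits, runs a fixed number of iterations, and trains on six made-up, rescaled points rather than real Iris measurements.

```python
import math
from typing import List


def sigmoid(L: float) -> float:
    """Step 2: H(L) = 1 / (1 + e^(-L))."""
    return 1.0 / (1.0 + math.exp(-L))


def weighted_sum(weights: List[float], features: List[float]) -> float:
    """Step 1: L = w1*x1 + ... + wn*xn, plus a bias (last weight; an assumption)."""
    return sum(w * x for w, x in zip(weights, features)) + weights[-1]


def train(X: List[List[float]], y: List[int], n_classes: int,
          learning_rate: float = 0.5, iterations: int = 1000) -> List[List[float]]:
    n_features = len(X[0])
    # Step 3: one weight vector (plus bias) per class.
    models = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(iterations):                    # steps 6-7: fixed iteration budget
        for c, weights in enumerate(models):
            grad = [0.0] * (n_features + 1)
            for features, label in zip(X, y):      # step 4: compare H(L) with the label
                target = 1.0 if label == c else 0.0
                error = sigmoid(weighted_sum(weights, features)) - target
                for j, x in enumerate(features):
                    grad[j] += error * x
                grad[-1] += error
            for j in range(n_features + 1):        # step 5: adjust weights
                weights[j] -= learning_rate * grad[j] / len(X)
    return models


def predict(models: List[List[float]], features: List[float]) -> int:
    """Pick the class whose model gives the highest probability."""
    probabilities = [sigmoid(weighted_sum(w, features)) for w in models]
    return probabilities.index(max(probabilities))


# Toy data invented for illustration: four features, three classes (0, 1, 2).
X = [[1.0, 0.1, 0.1, 0.2], [0.9, 0.2, 0.0, 0.1],
     [0.1, 1.0, 0.2, 0.1], [0.2, 0.9, 0.1, 0.0],
     [0.1, 0.2, 1.0, 0.9], [0.0, 0.1, 0.9, 1.0]]
y = [0, 0, 1, 1, 2, 2]

models = train(X, y, n_classes=3)
print([predict(models, row) for row in X])
```

Training one binary model per class and taking the most confident one is the one-vs-rest approach; a softmax (multinomial) formulation is a common alternative.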
REMEMBER: G-I-G-O
GIGO stands for “Garbage in, garbage out”. Without quality,
cleaned source data, machine learning won’t work well.
Some estimates say that feature engineering & data cleaning
account for 80% of data scientists’ work.
WALKTHROUGH: AWS MACHINE LEARNING
WALKTHROUGH: GOOGLE CLOUD ML ENGINE
Feature | Google Cloud Machine Learning Engine | AWS Machine Learning
Service type | Hybrid | Fully managed
Supported algorithms | Linear and non-linear learners (DNNs, linear & logistic regression, Bayesian learners etc.) | Only linear learners (linear & logistic regression)
Algorithm implementation | BYO: Build using TensorFlow (low-level API or Estimators) or Keras (tf.contrib.keras) | Pre-defined (linear & logistic multi-class regression)
Accepted data sources | Google Cloud Storage, Bigtable & other Google Cloud Platform storage services | S3 (CSV-formatted data), Redshift
Built-in data transformation tools | Full control (TensorFlow functionality + packaging of Python modules as dependencies) | Limited, using “Recipes” (editable in the console UI)
Model training | In the cloud or locally | In the cloud
GPU support for training | Yes | No?
Hyperparameter tuning | Full control + automatic tuning | Limited manual tuning (regularisation, epochs)
Cross-validation | Yes, configurable | Yes (configurable train/test sets but no K-fold)
Model versioning | Explicit | Implicit
Underlying computation engine | TensorFlow | AWS EMR (Spark MLlib?)
Real-time predictions | Yes, using Cloud Engine Prediction API | Yes (built-in)
Batch predictions | Yes, using Cloud Engine Prediction API | Yes (built-in)
Monitoring | Yes (Training jobs console, TensorBoard) | Yes (CloudWatch and AWS ML UI)
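The cross-validation row distinguishes a single train/test split from K-fold. As a rough sketch of what K-fold means, the helper below (a hypothetical name, not part of either service) splits example indices into K parts and uses each part once as the test set. It does not shuffle, so in practice the data should be shuffled first, especially if it is sorted by class.

```python
from typing import List, Tuple


def k_fold_indices(n_examples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, test_indices) pairs for K-fold cross-validation."""
    indices = list(range(n_examples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_examples // k + (1 if i < n_examples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


folds = k_fold_indices(10, 3)
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Each example lands in exactly one test set, so every example is used for evaluation once and for training K-1 times.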
WHAT WE SAW WAS TWO SIMPLE DEMOS…
BUT THE POSSIBILITIES ARE ENDLESS
THANK YOU!
QUESTIONS?
