Borys Rybak “Azure Machine Learning Studio & Azure Workbench & R + Python”
Machine Learning
What is Machine Learning?
What exactly is Machine Learning?
“Machine learning enables
computers to learn from data
and experiences and to act
without being explicitly
programmed.”
130 exabytes of data (in 2005)
• 2005 - 130 exabytes of data
• 2010 - 1200 exabytes
• 2015 - 7900 exabytes
• 2020 - 40900 exabytes
What is the goal
of Machine Learning?
“The goal of machine learning is to
program computers
to use example data or past
experience to solve a given problem.”
Machine Learning in a Nutshell
Data: contains patterns
Machine learning algorithm: finds patterns
Model: recognizes patterns
Application: provides new data to see if it matches known patterns
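The four roles on this slide (data, learning algorithm, model, application) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `history` values are made up, and the "algorithm" is a deliberately simple one-feature threshold search, not anything Azure ML uses.

```python
# Data: (value, label) pairs that contain a pattern. These values are hypothetical.
history = [(20, False), (35, False), (50, False), (900, True), (1200, True)]


def train(rows):
    """Learning algorithm: find the threshold that misclassifies the fewest rows."""
    candidates = sorted(value for value, _ in rows)

    def errors(threshold):
        # Count rows where the rule "value >= threshold" disagrees with the label.
        return sum((value >= threshold) != label for value, label in rows)

    best = min(candidates, key=errors)
    # Model: a function that recognizes the pattern found in the data.
    return lambda value: value >= best


model = train(history)

# Application: provides new data to see if it matches the known pattern.
is_match_high = model(1000)
is_match_low = model(40)
```

With this toy data the search settles on a threshold of 900, so the model accepts 1000 and rejects 40.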
Finding Patterns: A Simple Example
Name     Amount      Fraudulent
Smith    $2,600.45   No
Potter   $2,294.58   Yes
Peters   $1,003.30   Yes
Adams    $8,488.32   No
What’s the pattern for fraudulent transactions?
Finding Patterns: Another Example
Name      Amount      Where Issued   Where Used   Age   Fraudulent
Smith     $2,600.45   USA            USA          22    No
Potter    $2,294.58   RUS            USA          29    Yes
Peters    $1,003.30   USA            RUS          25    Yes
Adams     $8,488.32   FRA            USA          64    No
Pali      $200.12     AUS            JAP          58    No
Jones     $3,250.11   USA            RUS          43    No
Hanford   $8,156.20   USA            UK           27    Yes
Marx      $7,475.11   UK             GER          32    No
Norse     $540.00     USA            ITA          27    No
Edson     $7,475.11   USA            RUS          20    Yes
What’s the pattern for fraudulent transactions?
P(A|B) = P(A) P(B|A) / P(B)
Deep
learning uses
this
Styles of ML Algorithms:
Examples
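Bayes' rule, one of the examples on this slide, can be checked with a small worked calculation. The numbers below are assumptions chosen for illustration (1% of transactions fraudulent, a flag that fires on 90% of fraudulent and 5% of legitimate transactions); they do not come from the talk.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) from P(A), P(B|A) and P(B|not A) using Bayes' rule."""
    # P(B) by the law of total probability.
    p_b = prior_a * p_b_given_a + (1 - prior_a) * p_b_given_not_a
    return prior_a * p_b_given_a / p_b


# A = "transaction is fraudulent", B = "transaction was flagged" (hypothetical rates).
p_fraud_given_flag = posterior(prior_a=0.01, p_b_given_a=0.90, p_b_given_not_a=0.05)
```

Under these assumptions a flagged transaction is fraudulent only about 15% of the time, because legitimate transactions vastly outnumber fraudulent ones.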
Terminology
Training data
• The prepared data used to create a model
• Creating a model is called training a model
Supervised learning (the most common approach)
• The value you want to predict is in the training data
• The data is labeled
Unsupervised learning
• The value you want to predict is not in the training data
• The data is unlabeled
Categorizing ML Problems:
Regression
Goal: Predict a value
Example question:
Categorizing ML Problems:
Classification
Goal: Predict a class
Example question:
Class 1 Class 2
Can be more than
two classes
Categorizing ML Problems:
Clustering
Goal: Discover structure
Example question:
Cluster 1
Cluster 2
Cluster 3
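The three problem categories above can each be illustrated with a tiny, self-contained Python function. This is a sketch under assumptions: all data values are invented, and the techniques (least squares, 1-nearest-neighbour, k-means on one-dimensional points) are simple stand-ins chosen for brevity, not the algorithms the talk demonstrates.

```python
from statistics import mean


def fit_line(xs, ys):
    """Regression (predict a value): ordinary least squares for y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx


def nearest_neighbour(labeled, x):
    """Classification (predict a class): label of the closest labeled example."""
    return min(labeled, key=lambda row: abs(row[0] - x))[1]


def k_means_1d(points, centers, rounds=10):
    """Clustering (discover structure): Lloyd's algorithm on 1-D points."""
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for p in points:
            closest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[closest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
label = nearest_neighbour([(1.0, "low"), (9.0, "high")], 7.5)
centers = k_means_1d([1, 2, 3, 10, 11, 12], centers=[0.0, 5.0])
```

Note that the first two functions need labeled data (supervised learning), while the clustering function only receives unlabeled points (unsupervised learning).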
Hey Machine!
What to learn?
Too complex:
(When you can’t code it...)
• NLP
• Handwriting recognition
• Computer vision
Too specialized:
(When you have to adapt...)
• Amazon
• Netflix
• Predictive typing
Too much:
(When you can’t scale it...)
• Spam detection
• Fraud detection
• Healthcare
Too robotic:
(When you can’t track it...)
• AI gaming
• Robot control
Collaborative Filtering: (Scenario)
• Lucy:
• Bob:
• John:
Content-Based Filtering: (Scenario)
• Lucy:
• Bob:
• John:
What to recommend to John?
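One way to answer "What to recommend to John?" with collaborative filtering is to find the user whose ratings correlate best with John's and suggest what that user liked. The sketch below assumes explicit ratings on a 1 to 5 scale; the movie names and ratings are hypothetical, since the slide's own scenario content was pictorial and is not recoverable.

```python
from math import sqrt

# Hypothetical explicit ratings (1-5). John has not rated "Movie D" yet.
ratings = {
    "Lucy": {"Movie A": 5, "Movie B": 4, "Movie C": 1, "Movie D": 5},
    "Bob": {"Movie A": 1, "Movie B": 2, "Movie C": 5, "Movie D": 1},
    "John": {"Movie A": 4, "Movie B": 5, "Movie C": 2},
}


def pearson(u, v):
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0
    xs = [ratings[u][item] for item in common]
    ys = [ratings[v][item] for item in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0


def recommend(user, k=1):
    """Suggest unseen items rated by the k most similar users (k-NN)."""
    others = [name for name in ratings if name != user]
    neighbours = sorted(others, key=lambda name: pearson(user, name), reverse=True)[:k]
    scores = {}
    for name in neighbours:
        for item, rating in ratings[name].items():
            if item not in ratings[user]:
                scores[item] = max(scores.get(item, 0), rating)
    return sorted(scores, key=scores.get, reverse=True)


suggestions = recommend("John")
```

With this data John's tastes correlate positively with Lucy's and negatively with Bob's, so Lucy's highly rated unseen movie is suggested. Nothing about the movies themselves is analyzed, which is the defining property of collaborative filtering.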
Collaborative Filtering:
• Does not rely on machine-analyzable content
• Similarity algorithms: k-NN and Pearson correlation
• Explicit vs. implicit data collection
  • Explicit: rating movies after watching them
  • Implicit: how often you watch movies, how many seasons you watched in a given period of time
• Used in: Facebook, MySpace, LinkedIn, Twitter
• Challenges: cold start, scalability, sparsity
• Algorithms: matrix factorization
Content-Based Filtering:
• History: information retrieval
• Algorithms for abstracting: tf-idf
• Algorithms: Bayesian classifiers, cluster analysis, decision trees, artificial neural networks
• Scenarios: Netflix
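The tf-idf weighting mentioned for content-based filtering can be sketched as follows. This is a minimal version under assumptions: the item descriptions are invented, tokenization is a plain whitespace split, and the formula is the unsmoothed variant tf × log(N / df); real systems often use smoothed variants.

```python
from collections import Counter
from math import log

# Hypothetical item descriptions used as "content".
docs = {
    "space": "space robots explore space",
    "romance": "love story in paris",
    "heist": "robots steal paintings in paris",
}


def tf_idf(documents):
    """Return {doc: {term: weight}} with weight = term frequency * log(N / document frequency)."""
    n_docs = len(documents)
    tokens = {name: text.lower().split() for name, text in documents.items()}
    # Document frequency: in how many documents each term appears.
    df = Counter(term for words in tokens.values() for term in set(words))
    weights = {}
    for name, words in tokens.items():
        counts = Counter(words)
        weights[name] = {
            term: (count / len(words)) * log(n_docs / df[term])
            for term, count in counts.items()
        }
    return weights


weights = tf_idf(docs)
```

In the "space" description the term "space" gets a higher weight than "robots", because it occurs more often there and appears in no other description.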
Time for Action!
The Machine Learning Process
[Diagram: Choose data → Raw Data → Apply preprocessing to data (preprocessing modules; iterate until data is ready) → Prepared Data → Apply learning algorithm to data (machine learning algorithms; iterate to find the best model) → Candidate Model → Chosen Model → Deploy chosen model → Applications. The goal: smarter applications.]
Azure ML
[Diagram: the machine learning process again, labeled with Azure ML components: ML Studio (preprocessing modules, machine learning algorithms) and the Azure ML API used by applications.]
Azure ML Studio
Allows running
“experiments”
[Diagram: the machine learning process, with ML Studio, preprocessing modules, machine learning algorithms, and the Azure ML API.]
Azure ML Data Preprocessing
Example modules
[Diagram: the machine learning process, with preprocessing modules, machine learning algorithms, and the Azure ML API.]
Clean Missing Data
• Removes or fills in missing values in a dataset
• Example: Replace each missing value with the mean of the other values in this column
Select Columns in Dataset
• Creates a view of a dataset that includes or excludes specific columns
• Example: Delete a column whose data is highly correlated with data in another column
Partition and Sample
• Divides or extracts a subset of a dataset
• Example: Select a specific number of rows from the data
There are dozens more data preprocessing modules
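To make the three example modules concrete, here is a plain-Python analogue of what each one does. This is not the Azure ML Studio API (the modules are configured in the Studio's visual designer); the function names, the column names, and the rows are all hypothetical.

```python
import random
from statistics import mean

# Hypothetical dataset; "age_months" is perfectly correlated with "age".
rows = [
    {"amount": 120.0, "age": 34, "age_months": 408},
    {"amount": None, "age": 51, "age_months": 612},
    {"amount": 80.0, "age": 27, "age_months": 324},
    {"amount": 100.0, "age": 45, "age_months": 540},
]


def clean_missing(data, column):
    """Like 'Clean Missing Data': replace missing values with the column mean."""
    fill = mean(r[column] for r in data if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in data]


def select_columns(data, exclude):
    """Like 'Select Columns in Dataset': drop the listed columns."""
    return [{k: v for k, v in r.items() if k not in exclude} for r in data]


def sample_rows(data, n, seed=0):
    """Like 'Partition and Sample': pick n rows at random (seeded for repeatability)."""
    return random.Random(seed).sample(data, n)


cleaned = clean_missing(rows, "amount")
selected = select_columns(cleaned, exclude={"age_months"})
subset = sample_rows(selected, 2)
```

The functions return new lists rather than modifying `rows`, which mirrors how each module in an experiment produces a new dataset for the next step.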
Azure ML Learning
Example algorithms
[Diagram: the machine learning process, with preprocessing modules, machine learning algorithms, and the Azure ML API.]
Regression
• Linear regression
• Ordinal regression
• Bayesian linear regression
• Neural network regression
• Decision forest regression
• Boosted decision tree regression
Classification
• Two-class neural network
• Two-class decision forest
• Multiclass neural network
• Multiclass decision forest
• Multiclass decision jungle
Clustering
• K-means
Azure ML is designed for data scientists
Deploying and using a model
1) Deploy chosen model
2) Call model with values for features the model requires
3) Get back value predicted by the model using those features
[Diagram: ML Studio deploys the chosen model behind the Azure ML API in Microsoft Azure; an application calls the Azure ML API.]
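Steps 2 and 3 amount to an HTTP request carrying feature values and a response carrying the prediction. The sketch below shows the general shape using only the Python standard library. Everything specific in it is an assumption: the URL, the API key placeholder, the bearer-token header, and the JSON layout are hypothetical and must be replaced with the values and request format shown for your own deployed web service.

```python
import json
import urllib.request

SCORING_URL = "https://example.invalid/score"  # hypothetical endpoint
API_KEY = "<your-api-key>"                      # placeholder, not a real key


def build_request(features):
    """Step 2: package the feature values the model requires as a JSON POST request."""
    body = json.dumps({"features": features}).encode("utf-8")  # hypothetical payload layout
    return urllib.request.Request(
        SCORING_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + API_KEY,
        },
        method="POST",
    )


def predict(features):
    """Step 3: send the request and return the decoded prediction."""
    with urllib.request.urlopen(build_request(features), timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; predict() would.
request = build_request({"amount": 2294.58, "age": 29})
```

Keeping request construction separate from the network call makes the payload easy to inspect and test before pointing the application at a live service.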
Time for Something Different!
Scenario: Detecting credit card fraud
[Diagram spanning Microsoft Azure and on-premises systems. Components: Azure ML (model), Azure Blobs (historical transaction data), payment application, credit card customers.]
Scenario: Predicting customer churn
[Diagram spanning Microsoft Azure and on-premises systems. Components: call center staff, call center application, Azure ML (model), Azure Blobs (data for ML), ML prep application, CRM data, detailed call data, aggregated call data, Azure HDInsight (aggregation application).]
Scenario: Predicting equipment failure
[Diagram spanning Microsoft Azure, on-premises systems, and internet-connected devices. Components: devices, Azure IoT Hub (streaming data), Azure Stream Analytics / HDInsight Storm / Spark Streaming, Azure ML (model), notification application, maintenance staff.]
The Reality of Machine Learning
No model is an island
Azure ML is commonly used with other Azure technologies
• For ingesting data
• For storing data
• For displaying data
• More ...
A complete solution often contains many different parts
• Which can make the story complex for customers
Solution: Cortana Intelligence Suite
• A group of related Azure data technologies for analytics and intelligence
Cortana Intelligence Suite
• Originally known as Cortana Analytics Suite
Cortana Intelligence Gallery
• Allows access to ML APIs, e.g., Face API, Translator API, etc.
• Offers example models for many industries, e.g., retail, healthcare, etc.
Challenges for Machine Learning
Who might be interested in ML?
• Business Leaders: want solutions to business problems
• Data Scientists: want powerful, easy-to-use tools
• Software Developers: want to create better applications
Finds patterns in data.
• Lots of data
• Lots of computing power
• Effective algorithms
A very well-defined question.
• Choosing what question to ask is the most important part of the process
• Ask yourself: Do you have the right data to answer this question?
• Ask yourself: Do you know how you’ll measure success?
Iterate until you have a model that
makes good predictions.
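The advice to iterate, together with the earlier question about how success will be measured, can be sketched as a loop that scores each candidate model on held-out data and stops once a target is met. The data, the candidate thresholds, and the 95% target below are all hypothetical; accuracy is just one possible success measure.

```python
def accuracy(model, rows):
    """Success measure: fraction of held-out rows the model predicts correctly."""
    return sum(model(value) == label for value, label in rows) / len(rows)


def make_threshold_model(threshold):
    """A very simple candidate model: predict True when the value reaches the threshold."""
    return lambda value: value >= threshold


# Hypothetical held-out data that was not used to build the candidates.
holdout = [(200, False), (1000, True), (700, False), (900, True)]
TARGET = 0.95  # hypothetical definition of "good predictions"

best_threshold, best_score = None, -1.0
for threshold in (250, 500, 800):  # candidate models to try, in order
    score = accuracy(make_threshold_model(threshold), holdout)
    if score > best_score:
        best_threshold, best_score = threshold, score
    if best_score >= TARGET:
        break  # good enough: stop iterating
```

Here the first two candidates reach 75% accuracy and the third reaches 100%, so the loop stops with a threshold of 800.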
“Make machine learning accessible to
every enterprise, data scientist, developer,
information worker, consumer, and device
anywhere in the world.”
• http://aka.ms/ml
• http://aka.ms/cntk
~ Stephen Hawking,
Web Summit 2017


Editor's Notes

  • #2: Thanks for the invitation; excitement about the city, venue and conference
  • #3: Tell what will be said:
      - Machine Learning: Why? How? What?
      - Machine Learning in action: short demo (basic algorithm, Python or R)
      - Azure Machine Learning: Why? How? What?
      - Azure Machine Learning Studio: showcase (demo1, demo2, demo3, summary)
      - Azure Machine Learning Workbench: showcase (demo1, demo2, summary)
  • #4: Introduction of myself (background, profession )
  • #5: Introduction of myself (personal) -> Ask questions for better communication and to get to know the audience.
  • #6: How many of you have already tried Machine Learning (a basic scenario, a complex one)? Who tried a basic one? So the rest tried a complex one? And how many of you want to find out something new about Machine Learning and maybe learn it?
  • #12: Customers can build Artificial Intelligence (AI) applications that intelligently: sense, process, and act on information (augmenting human capabilities, increasing speed and efficiency, and helping organizations achieve more)
  • #13: The letter 'A' = 1 byte. A page of letters = 1000x the letter 'A' = 1 kilobyte. A book of pages = 1000x a page of letters (500 sheets, double-sided) = 1 megabyte. The human genome = 1000x 1 megabyte (you can encode a whole human being on it) = 1 gigabyte. Putting every second of a human life for 80 years in HD video = 1 terabyte. The Amazon rainforest covers 1.4 billion acres (1 acre = ~4k m^2) and each acre has ~500 trees, so about 700 billion trees in total; if all the trees were cut down and turned into sheets of paper, with the letter 'A' x1000 placed on each sheet, that would give 1 to 2 petabytes of data. [Photo of the Earth] Multiply that by 1000 and we have an exabyte.
  • #16: ~Introduction to Machine Learning, 2nd Edition, MIT Press
  • #26: Too complex: When you can't code it. (NLP, hand-writing recognition, Computer Vision…) Too much: When you can't scale it. (e.g. Spam & fraud detection, healthcare) Too specialized: When you have to adapt/personalize. (Amazon, Netflix, predictive typing) Autonomous: When you can't track it. (AI gaming, robot control)
  • #27: Netflix example: movies already watched, same actor, same director, period of time (watched a whole season in one night, or a whole series). Basic recommender systems: collaborative filtering scenario and content-based filtering scenario. The Netflix Prize was an open competition for the best collaborative filtering algorithm to predict user ratings for films, based on previous ratings without any other information about the users or films, i.e. without the users or the films being identified except by numbers assigned for the contest. The competition was held by Netflix, an online DVD-rental and video streaming service, and was open to anyone who was neither connected with Netflix (current and former employees, agents, close relatives of Netflix employees, etc.) nor a resident of certain blocked countries (such as Cuba or North Korea). On September 21, 2009, the grand prize of US$1,000,000 was given to the BellKor's Pragmatic Chaos team, which bested Netflix's own algorithm for predicting ratings by 10.06%.
  • #28: What music do I want to listen to? last.fm: collaborative filtering (user's past behavior; cold start problem). Pandora: content-based filtering (characteristics of items, i.e. properties of songs, artists, ...; little information needed at the beginning).
  • #29: TF-IDF: In information retrieval, tf–idf or TFIDF, short for term frequency–inverse document frequency, is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus.
  • #30: Targeted marketing. Collaborative filtering. Collaborative + content-based filtering = hybrid recommender.
  • #40: Azure ML Workbench. Azure for Students. Showing the Azure portal. Showing DSVM + DLVM. Showing the Workbench demo. Why Deep Learning Virtual Machine? Increasingly, deep learning algorithms / deep neural networks are becoming one of the popular methods employed in many machine learning problems. They are especially good at machine cognition tasks like image, text, and audio/video understanding, often approaching human cognitive levels in some specific domains with advanced deep neural network architectures and access to large sets of data to train models. Deep learning requires a large amount of computational power to train models with these large datasets. With the cloud and the availability of Graphics Processing Units (GPUs), it is becoming possible to build sophisticated deep neural architectures and train them on a large data set on powerful computing infrastructure in the cloud. The Data Science Virtual Machine has provided a rich set of tools and samples for data preparation, machine learning, and deep learning. But one of the challenges faced by users is to easily discover the tools and samples for specific scenarios like deep learning, and to more easily provision GPU-based VM instances. The Deep Learning Virtual Machine (DLVM) addresses these challenges. What is Deep Learning Virtual Machine? The Deep Learning Virtual Machine is a specially configured variant of the Data Science Virtual Machine (DSVM) that makes it more straightforward to use GPU-based VM instances for training deep learning models. It is supported on Windows 2016 and the Ubuntu Data Science Virtual Machine. It shares the same core VM images (and hence all the rich toolset) as the DSVM but is configured to make deep learning easier. We also provide end-to-end samples for image and text understanding that are broadly applicable to many real-life AI scenarios.
The Deep Learning Virtual Machine also tries to make the rich set of tools and samples on the DSVM more easily discoverable by surfacing a catalog of the tools and samples on the virtual machine. In terms of tooling, the Deep Learning Virtual Machine provides several popular deep learning frameworks and tools to acquire and pre-process image and textual data. For a comprehensive list of tools, you can refer to the Data Science Virtual Machine Overview Page.
  • #42: Simply put, customer churn occurs when customers or subscribers stop doing business with a company or service. Also known as customer attrition, customer churn is a critical metric because it is much less expensive to retain existing customers than it is to acquire new customers.
  • #48: Challenges for ML. Skilled data scientists: like unicorns (PhD in maths, collaborative, CS and statistics background, understanding of the business approach). Infrastructure: it is cheaper to store data, and you can scale it in the cloud. Time: running algorithms in class took so long (leaving the laptop on overnight); with the cloud you can scale up incrementally. Global scalability: how to scale those algorithms globally.
  • #50: What Does Machine Learning Do? It finds patterns in data, then uses those patterns to predict the future. Examples: detecting credit card fraud; determining whether a customer is likely to switch to a competitor; deciding when to do preventive maintenance on a factory robot.
  • #51: Why is Machine Learning So Hot? Doing machine learning well requires lots of data, lots of compute power, and effective machine learning algorithms. All of those things are now more available than ever.
  • #52: What Do I Need for Data Science? A very well-defined question / business objective, e.g. "Which of my machines / which part of my machine is going to fail within the next two weeks?" (that is, what is the probability of failure in the next two weeks?). You need to have data: relevant data (data that is relevant to the business objective), accurate data, connected data, and large data (it is not necessarily Big Data, but large is good).
  • #54: Machine learning has its own jargon, such as training data, features, and supervised learning. Machine learning problems often fall into three areas: regression, classification, and clustering. Many different types of machine learning algorithms are used today. Using these algorithms to create good models requires effort.
  • #55: Azure Machine Learning. Machine learning isn’t hard to understand, although it can be hard to do well. Azure ML is a strong offering, especially for data scientists. Machine learning can probably help your customers.
  • #57: Watch Stephen Hawking speak about the future of AI at Web Summit.
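Collaborative filtering, as described in note #27, predicts a rating from other users' ratings alone, with users and films identified only by numbers. The sketch below is one simple user-based variant (cosine similarity plus a similarity-weighted average); it is not the method used by any Netflix Prize team, and the ratings table, `cosine_similarity`, and `predict_rating` are hypothetical names and data chosen for illustration.

```python
from math import sqrt

# Hypothetical toy data: user id -> {film id: rating on a 1-5 scale}.
RATINGS = {
    1: {101: 5, 102: 4, 103: 1},
    2: {101: 4, 102: 5, 104: 2},
    3: {101: 1, 103: 5, 104: 4},
    4: {102: 4, 103: 2},
}


def cosine_similarity(a, b):
    """Cosine similarity over the films both users rated (0.0 if none)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[f] * b[f] for f in shared)
    norm_a = sqrt(sum(a[f] ** 2 for f in shared))
    norm_b = sqrt(sum(b[f] ** 2 for f in shared))
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, film):
    """Similarity-weighted average of other users' ratings for `film`.

    Returns None when no other user has rated the film.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or film not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[film]
        weight_total += sim
    if weight_total == 0:
        return None
    return weighted_sum / weight_total


if __name__ == "__main__":
    print(predict_rating(RATINGS, user=4, film=101))
```

A film that nobody has rated yields `None`, which is the cold-start problem mentioned in note #28.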
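Content-based filtering, as described in note #28, needs only the properties of the items themselves, not other users' behavior. The following is a minimal sketch under the assumption that each song is described by a set of tags and that similarity is measured with the Jaccard index; the tags, `jaccard`, and `recommend` are hypothetical and do not describe how Pandora actually works.

```python
# Hypothetical song attributes (tags are illustrative only).
SONGS = {
    "song_a": {"rock", "guitar", "fast"},
    "song_b": {"rock", "guitar", "slow"},
    "song_c": {"jazz", "piano", "slow"},
    "song_d": {"jazz", "saxophone", "fast"},
}


def jaccard(a, b):
    """Jaccard index: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(songs, liked, top_n=2):
    """Return the top_n songs whose attributes best match the liked song."""
    scored = [
        (jaccard(songs[liked], attrs), name)
        for name, attrs in songs.items()
        if name != liked
    ]
    # Highest similarity first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


if __name__ == "__main__":
    print(recommend(SONGS, "song_a"))
```

A single liked song is enough to produce a ranking, which matches the note's point that little information is needed at the beginning.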
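The tf–idf statistic from note #29 can be sketched in a few lines. This version assumes one common weighting (term count divided by document length, times the natural log of the number of documents over the document frequency); other variants exist, and the corpus and the `tf_idf` function name are hypothetical.

```python
import math
from collections import Counter

# Hypothetical three-document corpus.
DOCS = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]


def tf_idf(docs):
    """Return one {term: tf-idf score} dict per document."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


if __name__ == "__main__":
    for doc, doc_scores in zip(DOCS, tf_idf(DOCS)):
        print(doc, doc_scores)
```

A word that occurs in every document (here "the") scores zero, while a word unique to one document (here "mat") scores highest within it, which is the sense in which tf–idf reflects importance to a document within a corpus.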
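The three problem areas named in note #54 can each be shown with a toy one-dimensional example. The algorithms below (a least-squares line, a nearest-centroid classifier, and k-means) are illustrative choices that the slides do not name, and all function names and data are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Regression: least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    return slope, y_bar - slope * x_bar


def nearest_centroid(train, value):
    """Classification: return the label whose training mean is closest to value.

    `train` maps each label to the list of labeled training values.
    """
    centroids = {label: mean(values) for label, values in train.items()}
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def k_means_1d(values, centers, iterations=10):
    """Clustering: group unlabeled values around the given starting centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


if __name__ == "__main__":
    print(fit_line([1, 2, 3], [2, 4, 6]))
    print(nearest_centroid({"small": [1, 2, 3], "large": [10, 11, 12]}, 9))
    print(k_means_1d([1, 2, 3, 10, 11, 12], centers=[0, 5]))
```

Regression and classification are supervised (the training data carries the answers), whereas clustering receives no labels and has to find the groups on its own.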