Microsoft Technologies for Data Science
Mark Tabladillo, Ph.D.
Solution Architect (Data Scientist)
Microsoft
August 2016: SQL Saturday Columbus GA
Networking
Interactive

Terms Definition
Data Science
Machine Learning
Data Mining
Applied Statistics
the automated or semi-automated process of discovering patterns in data
Applied scientific method
http://www.kdnuggets.com/polls/2015/analytics-data-mining-data-science-software-used.html
http://products.office.com/en-us/excel
http://www.microsoft.com/en-us/server-cloud/products/sql-server/
http://pytools.codeplex.com/
http://azure.microsoft.com/en-us/services/hdinsight/
http://www.revolutionanalytics.com/
Microsoft Data Science Technologies 201608

Technology Choices
SQL SERVER ANALYSIS SERVICES: Enterprise; Business Intelligence
EXCEL ADD-IN FOR SSAS: Office 365; Office 2013 or higher (x64)
SEMANTIC SEARCH: Enterprise; Business Intelligence; Standard; Web; Express with Advanced Services
MICROSOFT AZURE ML: Free (size limited); Paid (web service): Experiment + Query
F#: Open Source
SQL SERVER R SERVICES: SQL Server 2016 or higher
http://download.microsoft.com/download/F/C/2/FC21C981-4351-4434-A78A-3384CA7515BF/SQL_Server_2016_Deeper_Insights_Across_Data_White_Paper.pdf
SS
SQL
AS
NoSQL
Data mining add-in for business analysts
• Ease of use
• Rich data mining
• Scalable
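Semantic Search architecture (diagram labels)
• Inputs: Varchar and NVarchar columns; Office and PDF documents (read through iFilters)
• Full-Text Keyword Index ("FTI")
• Semantic Key Phrase Index, also called the Tag Index ("TI")
• Semantic Document Similarity Index ("DSI")
• Semantic Database
• Output: rowset with scores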
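The components named on this slide (a keyword index, a document similarity index, and a rowset output with scores) can be illustrated with a small Python sketch. It is not SQL Server's algorithm and uses no SQL Server API. It swaps in plain term counts and cosine similarity only to show what "similar documents with scores" looks like as data. The function names `term_counts`, `cosine`, and `similar_documents` are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def term_counts(text: str) -> Counter:
    """Lowercase the text and count alphabetic tokens (a crude keyword index)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def similar_documents(docs: Dict[str, str], source: str) -> List[Tuple[str, float]]:
    """Return (document name, score) rows for every other document, best first."""
    index = {name: term_counts(text) for name, text in docs.items()}
    rows = [
        (name, cosine(index[source], vec))
        for name, vec in index.items()
        if name != source
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical documents, used only for illustration.
    docs = {
        "a": "data mining finds patterns in data",
        "b": "data mining and data science",
        "c": "cooking pasta recipes",
    }
    for name, score in similar_documents(docs, "a"):
        print(name, round(score, 3))
```

Each returned row pairs a document with a score, which mirrors the "rowset output with scores" shape on the slide. A real semantic index would rank by statistically significant key phrases rather than raw term counts.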
Simplified Chinese
British English
Portuguese
Chinese (Hong Kong SAR, PRC)
Spanish
Chinese (Singapore)
Chinese (Macau SAR)
Time in Seconds vs. Number of Documents
(2011 – K. Mukerjee, T. Porter, S. Gherman – Microsoft)
http://users.cis.fiu.edu/~lzhen001/activities/KDD2011Program/docs/p213.pdf
Features by edition: Microsoft R Open (R distribution, free), Microsoft R Client (free), Microsoft R Server (commercial)
Big Data
• Microsoft R Open: In-memory bound. Can only process datasets that fit into the available memory.
• Microsoft R Client: In-memory bound. Can process datasets that fit into the available memory. Operates on large volumes when connected to R Server.
• Microsoft R Server: Disk scalability. Operates on bigger volumes & factors.
Speed of Analysis
• Microsoft R Open: Multi-threaded when MKL is installed for non-ScaleR functions.
• Microsoft R Client: Multi-threaded with MKL for non-ScaleR functions. Up to 2 threads for ScaleR functions with a local compute context.
• Microsoft R Server: Full parallel threading & processing.
Enterprise Readiness
• Microsoft R Open: Community support
• Microsoft R Client: Community support
• Microsoft R Server: Commercial support
Analytic Breadth & Depth
• Microsoft R Open: 8000+ open source packages
• Microsoft R Client: Leverage & optimize open source R packages plus 'Big Data'-ready ScaleR packages
• Microsoft R Server: Leverage & optimize open source R packages plus 'Big Data'-ready + multithreaded-ready ScaleR packages
Commercial Viability
• Microsoft R Open: Risk of deployment to open source
• Microsoft R Client: Free for everyone
• Microsoft R Server: Commercial licenses
DeployR Enterprise
• Microsoft R Open: Not available
• Microsoft R Client: Not available
• Microsoft R Server: Included
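The contrast between "in-memory bound" and "disk scalability" in the comparison above can be sketched in a few lines of Python. This is not ScaleR code and makes no claim about how R Server is implemented. It only shows the general idea of streaming fixed-size chunks so that a statistic can be computed without holding the whole dataset in memory. The names `mean_in_memory`, `chunks`, and `mean_chunked`, and the chunk sizes, are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def mean_in_memory(values: Iterable[float]) -> float:
    """Materialize every value first, so the dataset must fit in memory."""
    data = list(values)
    return sum(data) / len(data)


def chunks(values: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(values)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def mean_chunked(values: Iterable[float], chunk_size: int = 1000) -> float:
    """Keep only a running sum and count; one chunk is in memory at a time."""
    total, count = 0.0, 0
    for block in chunks(values, chunk_size):
        total += sum(block)
        count += len(block)
    if count == 0:
        raise ValueError("no data")
    return total / count


if __name__ == "__main__":
    # A generator stands in for rows streamed from disk.
    stream = (float(i) for i in range(1, 10_001))
    print(mean_chunked(stream, chunk_size=256))
```

Both functions return the same mean for the same data. The chunked version trades one pass over a stream for a memory footprint bounded by the chunk size, which is the property the "disk scalability" row describes.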
Microsoft R Server Editions: Description | Install ScaleR | Get Started
R Server for Hadoop: Scale your analysis transparently by distributing work across nodes without complex programming | Doc | Doc
R Server for Teradata DB: Run advanced analytics in-database for seamless data analysis | Doc | Doc
R Server for Linux: Bring predictive and prescriptive analytics power to your Linux environments | Doc | Doc
http://datacamp.com

Platform: Mutable | Immutable
Classic Open Source: Java | Scala
.NET (Now Open Source): C#, C++, VB.NET | F#
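The mutable versus immutable split on this slide can be illustrated in Python, used here only as a neutral stand-in for the languages listed. In this sketch, under those assumptions, a regular dataclass is updated in place, while a frozen dataclass rejects assignment and is "changed" by building a new value. The class and function names are hypothetical.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass
class MutablePoint:
    """Fields can be reassigned after construction."""
    x: float
    y: float


@dataclass(frozen=True)
class ImmutablePoint:
    """Fields cannot be reassigned; updates produce a new object."""
    x: float
    y: float


def shift_in_place(p: MutablePoint, dx: float) -> None:
    """Mutable style: modify the caller's object directly."""
    p.x += dx


def shifted(p: ImmutablePoint, dx: float) -> ImmutablePoint:
    """Immutable style: return a modified copy and leave the original alone."""
    return replace(p, x=p.x + dx)


if __name__ == "__main__":
    m = MutablePoint(1.0, 2.0)
    shift_in_place(m, 3.0)
    print(m)

    p = ImmutablePoint(1.0, 2.0)
    q = shifted(p, 3.0)
    print(p, q)

    try:
        p.x = 9.0  # assignment on a frozen dataclass raises
    except FrozenInstanceError as err:
        print("rejected:", type(err).__name__)
```

The immutable style makes it safe to share values across threads or pipeline stages because no caller can change them, which is one reason functional-first languages default to it.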
https://www.microsoft.com/en-us/cloud-platform/what-is-cortana-intelligence-suite
Capabilities | Products
Preconfigured solutions: Business scenarios | Forecasting, churn, etc.
Intelligence: Integration with Cortana; Bot services; Cognitive services | Cortana; Bot Framework; Cognitive Services
Dashboards and visualizations: Dashboards and visualizations | Power BI
Machine learning and advanced analytics: Machine learning; Hadoop; Distributed analytics; Complex event processing | Machine Learning; HDInsight (Data Lake service); Data Lake analytics; Stream Analytics
Big data stores: Big Data repository; Elastic data warehouse | Data Lake store, Blobs; SQL Data Warehouse
Information management: Data orchestration; Data catalog; Event ingestion | Data Factory; Data catalog; Event Hubs

https://github.com/jakevdp/sklearn_pycon2015
http://www.bing.com/explore/predicts
https://techcrunch.com/2016/07/07/microsoft-now-helps-businesses-use-the-data-that-powers-bing-predicts/
https://academy.microsoft.com/en-US/professional-degree/data-science/
https://borntolearn.mslearn.net/b/weblog/posts/announcing-the-microsoft-professional-degree-mpd-program
http://www.kdnuggets.com/2015/09/free-data-science-books.html
https://channel9.msdn.com/Blogs/Windows-Azure
https://mva.microsoft.com/
http://blogs.technet.com/b/machinelearning/
http://social.msdn.microsoft.com/forums/azure/en-US/home?forum=MachineLearning
http://sqlserverdatamining.com
http://marktab.net
http://curah.microsoft.com/342704/azure-machine-learning-videos-february-2015
http://datascience.sqlpass.org/
https://www.youtube.com/channel/UCqB3xWdwjA9soFV6EOu7qfg