SQL Server 2017 Machine Learning Services
Familiar
Scalable
Secure
SQL Server
ML Services
Linux
Hadoop
Teradata
Windows
Commercial: R Server
Community: R Open
Installed Packages
Base
- stats
- graphics
- grDevices
- utils
- datasets
- methods
- base
Recommended
- boot
- class
- cluster
- codetools
- foreign
- KernSmooth
- lattice
- MASS
- Matrix
- mgcv
- nlme
- nnet
- rpart
- spatial
- survival
Microsoft (Developed / Maintained)
- checkpoint
- deployRserve
- doParallel
- foreach
- jsonlite
- iterators
- microsoftR
- RevoIOQ
- RevoMods
- RevoUtils
- RODBC
- RevoUtilsMath
- azureml
- rmr2
- rhdfs
- rhbase
- plyrmr
Open-Source #1
Additional
CRAN R
- curl
- jsonlite
- png
- R6
- RODBC
Microsoft R Open #2
(Intel MKL)
Microsoft R Server #4
Microsoft R Client (free) #3
Microsoft (Developed / Maintained)
- RevoScaleR
- MicrosoftML
- CompatibilityAPI
- mrupdate
- RevoIOQ
- RevoTreeView
- mrsdeploy
- sqlrutils
- olapR
Commercially licenced & supported
Open-Source
Open-Source
Iterate/
Sequence
Selecting Features
New Services
Accepting License Agreements
Linux, Windows, Hadoop & Teradata
R Server Technology
Algorithm Master
Predictive Algorithm
Big Data
Analyze Blocks In Parallel
Load Block At A Time
Distribute Work, Compile Results
“Pack and Ship” Requests to Remote Environments
Results
Copyright Microsoft Corporation. All rights reserved.
Microsoft R Server “Client”
Microsoft R Server “Server”
Console
R IDE or command-line
REMOTE CONTEXT
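The labels on this slide (load a block at a time, analyze blocks in parallel, compile results) can be illustrated with a minimal sketch. It is written in Python, which the deck names alongside R as a supported language, and it is not ScaleR's implementation: the names `block_stats`, `combine`, `fit_line` and `read_blocks` are hypothetical, the data is made up, and threads stand in for cores or nodes. The idea is that a straight-line fit needs only a few running sums, so each block can be summarized on its own and the summaries added together.

```python
from concurrent.futures import ThreadPoolExecutor


def block_stats(block):
    """Summarize one block of (x, y) pairs as five running sums."""
    n = len(block)
    sx = sum(x for x, _ in block)
    sy = sum(y for _, y in block)
    sxx = sum(x * x for x, _ in block)
    sxy = sum(x * y for x, y in block)
    return n, sx, sy, sxx, sxy


def combine(parts):
    """Compile results: add the per-block sums element by element."""
    return tuple(sum(values) for values in zip(*parts))


def fit_line(blocks, workers=4):
    """Least-squares slope and intercept from block summaries only."""
    # pool.map submits every block up front; a production system would
    # bound how many blocks are in flight at once.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(block_stats, blocks))
    n, sx, sy, sxx, sxy = combine(parts)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


def read_blocks(rows, block_size):
    """Yield fixed-size blocks; here the 'file' is an in-memory list."""
    for start in range(0, len(rows), block_size):
        yield rows[start:start + block_size]


# Made-up data lying exactly on y = 2x + 1.
rows = [(x, 2 * x + 1) for x in range(10)]
slope, intercept = fit_line(read_blocks(rows, block_size=3))
```

Because only the five sums leave each block, the memory needed is set by the block size rather than by the size of the data set, and the same summaries could in principle be computed on remote workers and shipped back.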
DI
R+CRAN
MicrosoftR
DistributedR
DeployR DevelopR
ScaleR
ConnectR
• Cloudera
• Hortonworks
• MapR
• Apache Spark
• IBM Platform LSF
• Microsoft HPC
Clusters
• SQL Server
• Teradata
Database
• Red Hat
• SuSE Servers
• Windows
DistributeR
Local parallel processing (Linux or Windows)
### SETUP LOCAL ENVIRONMENT VARIABLES ###
myLocalCC <- "localpar"
### LOCAL COMPUTE CONTEXT ###
rxSetComputeContext(myLocalCC)
### CREATE LINUX, DIRECTORY AND FILE OBJECTS ###
localFS <- RxNativeFileSystem()
AirlineDataSet <- RxXdfData("AirlineDemoSmall.xdf",
                            fileSystem = localFS)
In Hadoop
### SETUP HADOOP ENVIRONMENT VARIABLES ###
myHadoopCC <- RxHadoopMR()
### HADOOP COMPUTE CONTEXT ###
rxSetComputeContext(myHadoopCC)
### CREATE HDFS, DIRECTORY AND FILE OBJECTS ###
hdfsFS <- RxHdfsFileSystem()
hdfsFS
Compute context R script: sets where the model will run.
### ANALYTICAL PROCESSING ###
### Statistical summary of the data
rxSummary(~ ArrDelay + DayOfWeek, data = AirlineDataSet, reportProgress = 1)
### Cross-tabulate the data
rxCrossTabs(ArrDelay ~ DayOfWeek, data = AirlineDataSet, means = TRUE)
### Linear model and plot
hdfsXdfArrLateLinMod <- rxLinMod(ArrDelay ~ DayOfWeek + 0, data = AirlineDataSet)
plot(hdfsXdfArrLateLinMod$coefficients)
Functional model R script: does not need to change to run in Hadoop.
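The split between a compute-context script and a functional script can be sketched in a few lines. This is an illustration in Python (listed in the deck as a supported language), not the RevoScaleR API: `LocalSequential`, `LocalParallel`, `set_compute_context` and `mean_delay_by_day` are hypothetical names, and the records are invented. The point is that the analysis function never names where it runs; it only asks the current context to run its tasks.

```python
from concurrent.futures import ThreadPoolExecutor


class LocalSequential:
    """Hypothetical context: run every task in the calling thread."""

    def run(self, func, items):
        return [func(item) for item in items]


class LocalParallel:
    """Hypothetical context: run tasks on a thread pool."""

    def __init__(self, workers=4):
        self.workers = workers

    def run(self, func, items):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            return list(pool.map(func, items))


_context = LocalSequential()


def set_compute_context(context):
    """Compute-context script: decides where later work will run."""
    global _context
    _context = context


def mean_delay_by_day(records):
    """Functional script: unchanged whichever context is active."""
    days = sorted({day for day, _ in records})

    def mean_for(day):
        delays = [delay for d, delay in records if d == day]
        return day, sum(delays) / len(delays)

    return dict(_context.run(mean_for, days))


# Invented (day, arrival delay) records.
records = [("Mon", 10), ("Mon", 20), ("Tue", 5), ("Tue", 15), ("Tue", 10)]

set_compute_context(LocalSequential())
sequential_result = mean_delay_by_day(records)

set_compute_context(LocalParallel(workers=2))
parallel_result = mean_delay_by_day(records)
```

Swapping the context changes only the one `set_compute_context` call, which mirrors the slide's claim that the functional script does not need to change to run in Hadoop.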
EXECUTE sp_execute_external_script
@language = N'R'
, @script = N'x <- as.matrix(InputDataSet);
y <- array(dim1:dim2);
OutputDataSet <- as.data.frame(x %*% y);'
, @input_data_1 = N'SELECT [Col1] from MyData;'
, @params = N'@dim1 int, @dim2 int'
, @dim1 = 12, @dim2 = 15
WITH RESULT SETS (([Col1] int, [Col2] int, [Col3] int, [Col4] int));
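Why the result set declares four columns is easier to see with the arithmetic written out. The sketch below reproduces the calculation in plain Python under two assumptions: that R promotes `y` to a 1×4 row vector when multiplying it by the single-column matrix `x`, and that `MyData` holds the made-up values 1, 2 and 3. The helper name `column_times_row` is hypothetical.

```python
def column_times_row(column, dim1, dim2):
    """Multiply an n x 1 column by the row vector dim1..dim2 (inclusive)."""
    # R's dim1:dim2 includes both ends, so 12:15 has four elements.
    row = list(range(dim1, dim2 + 1))
    return [[value * factor for factor in row] for value in column]


# Assumed contents of MyData.Col1; the slide does not show the table.
input_column = [1, 2, 3]
output = column_times_row(input_column, 12, 15)
```

Each input row becomes one output row with one value per element of `dim1:dim2`, so changing `@dim1` or `@dim2` would also require changing the column list in `WITH RESULT SETS`.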
launchpad.exe
sp_execute_external_script
sqlservr.exe
Named pipe
Each SQL instance has a launchpad
SQLOS
XEvent
MSSQLSERVER Service
MSSQLLAUNCHPAD Service
“What” and “How” to “launch”
“launcher”
Windows “satellite” process (×5)
sqlsatellite.dll

Editor's Notes

  • #3: Slide objective: Show the three pillars of Microsoft Advanced Analytics. Talking points: Microsoft’s Advanced Analytics products work with all your current investments: we support different platforms such as Windows, Linux, SQL, Teradata and even big data, both on premises and in the cloud. Microsoft has long been investing in innovative artificial intelligence technologies and baking them into products such as Cortana, HoloLens, Bing and Skype. We are now commercializing these technologies through our advanced analytics products, including Microsoft R. Microsoft wants to help you accelerate the process of generating value from your data, which is why we are not only building the tools but also investing heavily in creating solutions that can help you drive value.
  • #4: What does R mean in “R Services”? R is an open-source programming language for statistical computing; Microsoft R Open is Microsoft’s distribution of it.
  • #5: Last but not least, customers need flexibility in their choice of platform, programming languages and data infrastructure to get the most from their data. Why? In most IT environments, platforms, technologies and skills are as diverse as they have ever been; the data platform of the future needs to let you build intelligent applications on any data, any platform and any language, on premises and in the cloud. SQL Server manages your data across platforms, with any skills, on premises and in the cloud. Our goal is to meet you where you are, on any platform, anywhere, with the tools and languages of your choice. SQL Server now supports Windows, Linux and Docker containers, and lets you use the language of your choice, R or Python, for advanced analytics.
  • #7: Slide objective: Show broad commitment to R by preserving freely available, enhanced editions; Windows and SQL Server editions; and R Server editions for leading EDWs, Linux and Hadoop platforms. Differentiate the free, open editions from the commercial ones by mentioning the availability of commercial 24x7 support and the enhancements that support very large-scale data analytics at speed.
  • #8: Slide objective: Illustrate the potential scale benefits possible with R Server’s ScaleR algorithms. Show a representative example and explain the three mechanisms that help achieve the improvements. Talking points: We tested the improved data and computational scale of R Server’s ScaleR library of enhanced, parallelized algorithms. This is an example. Speed: On a 4-core laptop with 8 GB of RAM, open source R could process about 300,000 events in a particular data set before exhausting available memory. The test took about 77 seconds to run the most commonly used R linear regression function, “lm”. We then ran the same test using our parallelized linear regression module, rxLinMod, rewritten in C++. Data scale: Algorithms in the ScaleR library are also rewritten to analyze data in “chunks” to eliminate the memory limits of typical open source R algorithms. Where the open source lm exhausted memory at about 300,000 events, the improved rxLinMod was still working fine at 5 million events, where we stopped testing. The result is a 50x performance improvement over open source linear regression, and no memory limits. Parallel scale: This example shows only the effects of running optimized, compiled code on all cores of a laptop. Greater benefits are available. What is not shown is that the work done to parallelize across 4 cores can also be used to scale across many nodes in systems such as EDWs and Hadoop. While results vary, the system, as you can see, responds linearly with respect to data size. Rehosting using R Server for Hadoop can provide even more dramatic speed and scale results.
  • #12: - Wrangle data, experiment with models, and test models from a workstation - Use your favorite IDE or notebook service
  • #13: - Train models on big data, at speed, in parallel - Transform large data sets using T-SQL, R, and Python - Repeatedly score and rescore large data assets
  • #14: - Embed R or Python in T-SQL - Execute using T-SQL BI, reporting & app dev tools
  • #15: - Embed R and Python within T-SQL scripts - Makes R & Python callable from traditional applications - Deploy smart apps using existing skills & tools
  • #16: - Run trained models in real-time with low latency - Detect anomalies at speed
  • #23: Microsoft R Server is a broadly deployable, enterprise-class analytics platform based on R that is supported, scalable and secure. Supporting a variety of big data statistics, predictive modeling and machine learning capabilities, R Server covers the full range of analytics: exploration, analysis, visualization and modeling. Slide objective: Introduce the high-level value of R Server and R Services over other instantiations of the R language. Talking points: R Server products provide an enhanced experience for the R user without loss of compatibility. R Server products are “open core”: they use the open source R product in its entirety and build new capabilities around that core without affecting compatibility. Users of R Server products enjoy full compatibility with open source R and with the entire (and vast) collection of algorithms, connectors and visualization tools shared openly via CRAN, Bioconductor and other shared resources such as GitHub. Key extensions enable R to tackle big data challenges that exceed the capacity of open source R. R scripts built for one platform using R Server can easily be run on another platform running R Server. We call it WODA: write once, deploy anywhere. Two key contributions: build on any version of the product and deploy using other versions; investment protection as platform choices change. Develop on the desktop and immediately deploy to an RDBMS (SQL Server), an EDW (SQL Server and Teradata) or Hadoop (Microsoft, Cloudera, Hortonworks and MapR).
  • #29: Slide objective: Present the range of already parallelized functions and algorithms available with RevoScaleR. Talking points: This list shows the functions and algorithms that are available with all versions of R Server. We call this the ScaleR library. Each function can: execute work steps in parallel or in series as needed; process work using multiple threads, cores, sockets or nodes; process one or more data blocks in each thread, core, socket or node; combine the results into a single mathematically correct answer; do the work either locally or ship the request to another system for completion remotely; and completely hide the complexity of parallelization, multiple steps and iterations from the R programmer. Four functions, rxDataStep, rxExec, PEMA-R and the newest, rxExecBy, provide frameworks for users to write their own routines, functions and algorithms using parallelization. While more difficult than using pre-written PEMAs, the results are portable (usable on multiple systems) and easier than writing directly to the platform to create custom algorithms. Notes: One algorithm framework, the PEMA-R API, is not available in clustered systems, an exception to portability across systems.
  • #31: sp_execute_external_script is an example of a special proc, or specproc. The source code of the procedure can’t be found in the resource database; it is implemented in our source code.
  • #32: Talk about other “extensible” environments we have used in the past: xproc, sp_OA, linked servers, full-text.
  • #33: Follow the instructions in insidesqlr\readme.txt