Big Data Predictive Analytics
with Revolution R Enterprise
David Smith
Gartner BI Conference, April 2014
Chief Community Officer
@revodavid
OUR COMPANY
The leading provider
of advanced analytics
software and services
based on open source R,
since 2007
OUR SOFTWARE
The only Big Data, Big
Analytics software platform
based on the data science
language R
KUDOS
Visionary
Gartner Magic Quadrant
for Advanced Analytics
Platforms, 2014
What is R?
• Most widely used data analysis software
  – Used by 2M+ data scientists, statisticians and analysts
• Most powerful statistical programming language
  – Flexible, extensible and comprehensive for productivity
• Create beautiful and unique data visualizations
  – As seen in New York Times, Twitter and Flowing Data
• Thriving open-source community
  – Leading edge of analytics research
• Fills the talent gap
  – New graduates prefer R
White paper: R is Hot (bit.ly/r-is-hot)
Exploding growth and demand for R
• R is the highest paid IT skill
• R is the most-used data science language after SQL
• R is used by 70% of data miners
• R is #15 of all programming languages
• R is growing faster than any other data science language
• R is the #1 Google Search for Advanced Analytics software
• R has more than 2 million users worldwide
R Usage Growth
Rexer Data Miner Survey, 2007-2013
70% of data miners report using R
R is the first choice of more
data miners than any other
software
Source: www.rexeranalytics.com
Technical Support for Open Source R
AdviseR™ from Revolution Analytics
Technical support for open source R, from the R experts.
• 24x7 email and phone support
• On-line case management and knowledgebase
• Access to technical resources, documentation and user forums
• Exclusive on-line webinars from community experts
• Guaranteed response times
Also available: expert hands-on and on-line training for R, from Revolution Analytics AcademyR.
www.revolutionanalytics.com/AdviseR
www.revolutionanalytics.com/AcademyR
Revolution R Enterprise is the only big data big analytics platform based on open source R
• High Performance, Scalable Analytics
• Portable Across Enterprise Platforms
• Easier to Build & Deploy Analytics
Enhancing Open Source R for the Enterprise
Big Data: In-memory bound | Hybrid memory & disk scalability | Operates on bigger volumes & factors
Speed of Analysis: Single threaded | Parallel threading | Shrinks analysis time
Enterprise Readiness: Community support | Commercial support | Delivers full service production support
Analytic Breadth & Depth: 5000+ innovative analytic packages | Leverage open source packages plus Big Data ready packages | Supercharges R
Commercial Viability: Risk of deployment of open source | GPL-compatible licensing | Eliminate risk with open source
Powering Next Generation Analytics
Parallel External Memory Algorithms
[Diagram: combine intermediate results]
Eliminates Performance and Capacity Limits of Open Source R and Legacy SAS
• Unique PEMAs: Parallel, external-memory algorithms
• High-performance, scalable replacements for R/SAS analytic functions
• Parallel/distributed processing eliminates CPU bottleneck
• Data streaming eliminates memory size limitations
• Works with in-memory and disk-based architectures
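As a rough illustration of the external-memory idea (a sketch in base R, not Revolution Analytics' implementation), a linear regression can be fitted by streaming the data one chunk at a time, keeping only small intermediate results per chunk, and combining them at the end. The helper name chunked_lm and the simulated data are assumptions made for this example, and it assumes numeric predictors only.

```r
# Hypothetical helper (not part of RevoScaleR): fit a linear model by
# accumulating the cross-products X'X and X'y chunk by chunk, so that only
# one chunk needs to be in memory at a time.
chunked_lm <- function(chunks, formula) {
  xtx <- 0  # running sum of X'X (small p x p matrix)
  xty <- 0  # running sum of X'y (small p x 1 matrix)
  for (chunk in chunks) {
    mf <- model.frame(formula, data = chunk)
    X <- model.matrix(formula, data = mf)
    y <- model.response(mf)
    # Intermediate results for this chunk; each chunk is independent,
    # so this step could also be handed to parallel workers.
    xtx <- xtx + crossprod(X)
    xty <- xty + crossprod(X, y)
  }
  # Combine step: solve the normal equations from the accumulated sums.
  drop(solve(xtx, xty))
}

# Simulated data, split into 10 chunks of 100 rows each.
set.seed(1)
df <- data.frame(x = rnorm(1000))
df$y <- 2 + 3 * df$x + rnorm(1000)
chunks <- split(df, rep(1:10, each = 100))

coef_chunked <- chunked_lm(chunks, y ~ x)
coef_full <- coef(lm(y ~ x, data = df))  # all rows in memory, for comparison
print(rbind(chunked = coef_chunked, full = coef_full))
```

The chunked and in-memory fits agree up to floating-point error because the cross-products are plain sums over rows, which is what lets the per-chunk work be done in any order or in parallel.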
All of Open Source R plus:
• Big Data scalability
• High-performance analytics
• Development and deployment tools
• Data source connectivity
• Application integration framework
• Multi-platform architecture
• Support, Training and Services
Revolution R Enterprise is the Big Data Big Analytics Platform
DistributedR
ScaleR
ConnectR
DeployR
DESIGNED FOR SCALE, PORTABILITY & PERFORMANCE
In the Cloud: Amazon AWS
Workstations & Servers: Windows, Red Hat and SUSE Linux
Clustered Systems: IBM Platform LSF, Microsoft HPC
EDW: IBM Netezza, Teradata
Hadoop: Hortonworks, Cloudera
Write Once. Deploy Anywhere.
Write Once -> Deploy Anywhere

Set the desired compute context for code execution:
rxSetComputeContext("local") # Local system (DEFAULT)
rxSetComputeContext(RxHadoopMR(<data, server environment arguments>))
rxSetComputeContext(RxHpcServer(<data, server environment arguments>))
rxSetComputeContext(RxLsfCluster(<data, server environment arguments>))
rxSetComputeContext(RxTeradata(<data, server environment arguments>))

Same code to be run anywhere:
# Summarize and calculate descriptive statistics from the airDS data set
adsSummary <- rxSummary(~ ArrDelay + CRSDepTime + DayOfWeek, data = airDS)
# Fit a linear regression model
arrDelayLm1 <- rxLinMod(ArrDelay ~ DayOfWeek, data = airDS)
summary(arrDelayLm1)
In-Hadoop Big Data Big Analytics
• Eliminate data movement latency
• Speed model development
• Use commodity Hadoop nodes as analytics engine
[Diagram: Hadoop cluster with a Name Node, a Job Tracker, five Data Nodes and five Task Trackers; MapReduce on HDFS]
Revolution Analytics coupled with the Teradata Unified Data Architecture accelerates
big data analytics with the R language.
Teradata 14.10 + Revolution R Enterprise V7
In-Database Analytics:
• Parallel R in-database for big data analytics on Teradata
• Build parallel R models completely in R
• Use Teradata appliance as analytics engine
• No need to move data
RRE7 in the Cloud
• Revolution R Enterprise 7, on the industry-leading cloud platform
• Pay as you go, priced by cores x hours
  – No long-term commitment required
• Launch Windows and Linux servers on demand
  – Windows 2008 R2 with DevelopR
  – RHEL 6 with RStudio Server Professional
  – Server instances from 2 to 32 cores
  – Analyze data sets up to 2 TB
• Convenient, consistent and reliable
  – Available globally, accessible anywhere
  – Forum-based support with registration
• Free 14-day trial available
Cloud servers: $0.70 per core/hour, plus AWS infrastructure costs
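To make the "cores x hours" pricing concrete, here is a small worked example in R. The function name rre_license_cost and the 8-core, 10-hour workload are assumptions chosen for illustration; only the $0.70 per core/hour rate comes from the slide, and AWS infrastructure charges are not included.

```r
# Hypothetical helper: software charge for a cloud server, using the
# per-core/hour rate quoted on the slide. AWS infrastructure costs are extra.
rre_license_cost <- function(cores, hours, rate_per_core_hour = 0.70) {
  cores * hours * rate_per_core_hour
}

# Example workload (assumed): an 8-core server running for 10 hours.
cost <- rre_license_cost(cores = 8, hours = 10)
cat(sprintf("Software charge: $%.2f (plus AWS infrastructure costs)\n", cost))
```

For this assumed workload the software charge is 8 x 10 x $0.70 = $56.00.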
Revolution R Enterprise Ecosystem
Integration with the Big Data Analytics Stack
Deployment / Consumption
Data / Infrastructure
Advanced Analytics
ETL
SI / Service MSP / DSP
How Customers Revolutionize their Business
Power
“We’ve combined Revolution R
Enterprise and Hadoop to build and
deploy customized exploratory data
analysis and GAM survival models for
our marketing performance
management and attribution platform.
Given that our data sets are already in
the terabytes and are growing rapidly,
we depend on Revolution R Enterprise’s
scalability and power – we saw about
a 4x performance improvement on 50
million records. It works brilliantly.”
- CEO, John Wallace, DataSong
4X performance
50M records scored daily
Scalability
“We’ve been able to scale our solution to a
problem that’s so big that most companies could
not address it. If we had to go with a different
solution we wouldn’t be as efficient as we are
now.”
- SVP Analytics, Kevin Lyons, eXelate
TBs of data from 200+ data sources
Tens of thousands of attributes
Hundreds of millions of scores daily
2X data, 2X attributes, no impact on performance
Performance
“We need a high-performance analytics
infrastructure because marketing optimization is a
lot like a financial trading. By watching the market
constantly for data or market condition updates,
we can now identify opportunities for our
clients that would otherwise be lost.”
- Chief Analytics Officer, Leon Zemel, [x+1]
Why Revolution R Enterprise?
Platform
Independence
Take Big Cost Out
of Big Data
Supercharge R for
Massive Data
Power R for the
Enterprise
Thank You
David Smith
Chief Community Officer
@revodavid
blog.revolutionanalytics.com
