Bob Muenchen, Author, R for SAS and SPSS Users; Co-Author, R for Stata Users
muenchen.bob@gmail.com, http://r4stats.com

Copyright © 2010, 2011, Robert A Muenchen. All rights reserved.

 What is R?
 R’s Advantages
 R’s Disadvantages
 Installing and Maintaining R
 Ways of Running R
 An Example Program
 Where to Learn More

 “The most powerful statistical computing language on the planet.” -Norman Nie, Developer of SPSS
 Language + package + environment for graphics and data analysis
 Free and open source
 Created by Ross Ihaka & Robert Gentleman 1996 & extended by many more
 An implementation of the S language by John Chambers and others
 R has 4,950 add-ons, or nearly 100,000 procs

Source: http://r4stats.com/popularity

1. Data input & management (data step)
2. Analytics & graphics procedures (proc step)
3. Macro language
4. Matrix language
5. Output management systems (ODS/OMS)

R integrates these all seamlessly.

* SAS Approach;
DATA A; SET A;
  logX = log(X);
PROC REG;
  MODEL Y = logX;

# R Approach
lm( Y ~ log(X) )
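The one-line R approach above can be tried on made-up data. The following is a minimal sketch, not part of the original slides: the variable names `x` and `y`, the simulated values, and the true coefficients (2 and 3) are all assumptions chosen for illustration.

```r
# Simulated data; x, y and the coefficients used here are invented for illustration.
set.seed(42)
x <- runif(50, min = 1, max = 100)
y <- 2 + 3 * log(x) + rnorm(50, sd = 0.5)
example_data <- data.frame(x = x, y = y)

# The transformation is written inside the model formula,
# so no separate step is needed to create a logX variable.
fit <- lm(y ~ log(x), data = example_data)
print(coef(fit))
```

The estimated slope on `log(x)` should land near the value used to simulate the data, which is a quick way to check that the formula did what was intended.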

 Vast selection of analytics & graphics
 New methods are available sooner
 Many packages can run R (SAS, SPSS, Excel…)
 Its object orientation “does the right thing”
 Its language is powerful & fully integrated
 Procedures you write are on an equal footing
 It is the universal language of data analysis
 It runs on any computer
 Being open source, you can study and modify it
 It is free

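Two of the advantages listed above, object orientation that "does the right thing" and user-written procedures being on an equal footing, can be illustrated with R's S3 dispatch. This is a small sketch under stated assumptions: the data values and the class name `survey_item` are hypothetical and do not come from the slides.

```r
scores <- c(1, 2, 2, 3, 4, 5, 5, 4)                       # numeric vector
gender <- factor(c("f", "f", "f", "m", "m", "m", "m"))    # categorical variable

# The same generic function picks a method based on the class of its argument.
print(summary(scores))   # quartiles and mean for numbers
print(summary(gender))   # counts per level for a factor

# A user-written method for a hypothetical class "survey_item".
# It is dispatched by summary() exactly like the built-in methods.
summary.survey_item <- function(object, ...) {
  cat("Survey item with", length(object), "responses; mean =",
      mean(unclass(object), na.rm = TRUE), "\n")
  invisible(object)
}

item <- structure(scores, class = "survey_item")
summary(item)
```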




* Using SAS;
PROC TTEST DATA=classroom;
  CLASS gender;
  VAR score;

# In R
t.test(score ~ gender, data=classroom)

t.test(classroom$posttest, classroom$pretest, paired=TRUE)

 Language is somewhat harder to learn
 Help files are sparse & complex
 Must find R and its add-ons yourself
 Graphical user interfaces not as polished
 Most R functions hold data in main memory
   Rule-of-thumb: 10 million values per gigabyte
   SAS/SPSS: billions of records
   Several efforts underway to break R’s memory limit, including Revolution Analytics’ distribution
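The two kinds of t-test in the SAS and R comparison above can be run end to end on simulated data. This is only a sketch: the `classroom` data frame below is invented, and the column names are taken from the slide's calls. The paired test passes the two columns explicitly, because the default `t.test` method works on vectors rather than on a `data` argument.

```r
# Simulated classroom data; all values are made up for illustration.
set.seed(1)
classroom <- data.frame(
  gender  = factor(rep(c("f", "m"), each = 10)),
  pretest = round(rnorm(20, mean = 70, sd = 8))
)
classroom$posttest <- classroom$pretest + round(rnorm(20, mean = 5, sd = 3))
classroom$score <- classroom$posttest

# Independent-samples test: the formula interface looks up columns in `data`.
independent <- t.test(score ~ gender, data = classroom)

# Paired test: two vectors of equal length, matched row by row.
paired <- t.test(classroom$posttest, classroom$pretest, paired = TRUE)

print(independent$method)
print(paired$method)
```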


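The memory rule of thumb above (10 million values per gigabyte) can be put next to the raw storage cost of numeric data. The sketch below measures the size of a numeric vector with `object.size()`. Reading the gap between raw storage and the rule of thumb as headroom for working copies is an interpretation, not something stated on the slide.

```r
n_values <- 1e6
x <- numeric(n_values)                      # one million double-precision values

# Bytes actually used per value (slightly above 8 because of the vector header).
bytes_per_value <- as.numeric(object.size(x)) / n_values
print(bytes_per_value)

# Raw storage for 10 million doubles, expressed in megabytes.
raw_mb <- 10e6 * 8 / 1024^2
print(raw_mb)
```

Raw storage for 10 million doubles is well under a tenth of a gigabyte, so the rule of thumb leaves considerable room for the copies that many analysis functions make while they run.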
 Base R plus Recommended Packages like:
   Base SAS, SAS/STAT, SAS/GRAPH, SAS/IML Studio
   SPSS Stat. Base, SPSS Stat. Advanced, Regression
 Tested via extensive validation programs
 But add-on packages written by…
   Professor who invented the method?
   A student interpreting the method?

 Email support is free, quick, 24-hours:
   www.r-project.org/mail.html
   Stackoverflow.com
   Quora.com
   Crossvalidated stats.stackexchange.com/questions/tagged/r
 Phone support available commercially




1. Go to cran.r-project.org, the Comprehensive R Archive Network
2. Download binaries for Base & run
3. Add-ons: install.packages("myPackage")
4. To update: update.packages()

 Comprehensive R Archive Network
 Crantastic.com
 Inside-R.org
 R4Stats.com
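The add-on installation step above can be wrapped so that a package is downloaded only when it is not already present. This is a sketch under stated assumptions: `ensure_package` is a hypothetical helper name, not a function that ships with R, and it relies only on the base functions `requireNamespace()` and `install.packages()`.

```r
# Hypothetical helper (not part of R): install a package only when it is missing.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # downloads from the configured CRAN mirror
  }
  # Report whether the package can be loaded after the attempt.
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this call does not trigger a download.
ok <- ensure_package("stats")
print(ok)
```

For the update step, `update.packages(ask = FALSE)` updates every outdated package without prompting for each one.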
 Run code interactively
 Submit code from Excel, SAS, SPSS,…
 Point-n-click using Graphical User Interfaces (GUIs)
 Batch mode




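The batch-mode option listed among the ways of running R can be sketched as a script that reads its input file name from the command line. The script name `analysis.R` and the default file `mydata.csv` are assumptions for illustration. Such a script could be launched from a shell with `Rscript analysis.R mydata.csv` or with `R CMD BATCH analysis.R`.

```r
# Contents of a hypothetical script file, e.g. analysis.R.
# Arguments after the script name are returned by commandArgs(trailingOnly = TRUE).
args <- commandArgs(trailingOnly = TRUE)
input_file <- if (length(args) >= 1) args[1] else "mydata.csv"

if (file.exists(input_file)) {
  mydata <- read.csv(input_file)
  print(summary(mydata))
} else {
  # Fail softly so the sketch can be run without the data file present.
  message("Input file not found: ", input_file)
}
```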




run ExportDataSetToR("mydata");
submit/r;
   mydata$workshop <-
     factor(mydata$workshop)
   summary(mydata)
endsubmit;

GET FILE='mydata.sav'.
BEGIN PROGRAM R.
mydata <- spssdata.GetDataFromSPSS(
  variables = c("workshop gender q1 to q4"),
  missingValueToNA = TRUE,
  row.label = "id" )
summary(mydata)
END PROGRAM.




 A company focused on R development & support
 Run by SPSS founder Norman Nie
 Their enhanced distribution of R:
  Revolution R Enterprise
 Free for colleges and universities, including for
  outside consulting




Intro to R for SAS and SPSS User Webinar
mydata <- read.csv("mydata.csv")
print(mydata)
mydata$workshop <- factor(mydata$workshop)
summary(mydata)
plot( mydata$q1, mydata$q4 )
myModel <- lm( q4~q1+q2+q3, data=mydata )
summary( myModel )
anova( myModel )
plot( myModel )

> mydata <- read.csv("mydata.csv")
> print(mydata)
  workshop gender q1 q2 q3 q4
1        1      f  1  1  5  1
2        2      f  2  1  4  1
3        1      f  2  2  4  3
4        2   <NA>  3  1 NA  3
5        1      m  4  5  2  4
6        2      m  5  4  5  5
7        1      m  5  3  4  4
8        2      m  4  5  5  5




> mydata$workshop <-factor(mydata$workshop)
> summary(mydata)
 workshop  gender
 1:4       f   :3
 2:4       m   :4
           NA's:1
       q1             q2             q3              q4
 Min.   :1.00   Min.   :1.00   Min.   :2.000   Min.   :1.00
 1st Qu.:2.00   1st Qu.:1.00   1st Qu.:4.000   1st Qu.:2.50
 Median :3.50   Median :2.50   Median :4.000   Median :3.50
 Mean   :3.25   Mean   :2.75   Mean   :4.143   Mean   :3.25
 3rd Qu.:4.25   3rd Qu.:4.25   3rd Qu.:5.000   3rd Qu.:4.25
 Max.   :5.00   Max.   :5.00   Max.   :5.000   Max.   :5.00
                               NA's   :1.000
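The example can be reproduced without the `mydata.csv` file by typing in the eight rows shown in the printed data above. This is a sketch that builds the data frame directly. Converting `gender` to a factor is an added step: since R 4.0.0, `read.csv()` and `data.frame()` keep text columns as character by default, so without the conversion `summary()` would not show the f/m counts.

```r
# The eight observations printed on the slide, entered by hand.
mydata <- data.frame(
  workshop = c(1, 2, 1, 2, 1, 2, 1, 2),
  gender   = c("f", "f", "f", NA, "m", "m", "m", "m"),
  q1 = c(1, 2, 2, 3, 4, 5, 5, 4),
  q2 = c(1, 1, 2, 1, 5, 4, 3, 5),
  q3 = c(5, 4, 4, NA, 2, 5, 4, 5),
  q4 = c(1, 1, 3, 3, 4, 5, 4, 5)
)

# Treat the grouping variables as categorical so summary() reports counts.
mydata$workshop <- factor(mydata$workshop)
mydata$gender   <- factor(mydata$gender)

print(summary(mydata))
```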
> myModel <- lm(q4 ~ q1+q2+q3, data=mydata)
> summary(myModel)

Call:
lm(formula = q4 ~ q1 + q2 + q3, data = mydata)

Residuals:
      1       2       3       5       6       7       8
-0.3113 -0.4261  0.9428 -0.1797  0.0765  0.0225 -0.1246

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -1.3243     1.2877  -1.028    0.379
q1            0.4297     0.2623   1.638    0.200
q2            0.6310     0.2503   2.521    0.086
q3            0.3150     0.2557   1.232    0.306

Multiple R-squared: 0.9299,     Adjusted R-squared: 0.8598
F-statistic: 13.27 on 3 and 3 DF, p-value: 0.03084
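The regression output above lists residuals for observations 1, 2, 3, 5, 6, 7 and 8 only, because observation 4 has a missing `q3` and `lm()` drops incomplete rows by default. The sketch below refits the model on the printed data values and pulls individual results out of the fitted object. Only the four question columns are entered, which is a simplification made here.

```r
# The q1 to q4 values from the printed data; row 4 has a missing q3.
mydata <- data.frame(
  q1 = c(1, 2, 2, 3, 4, 5, 5, 4),
  q2 = c(1, 1, 2, 1, 5, 4, 3, 5),
  q3 = c(5, 4, 4, NA, 2, 5, 4, 5),
  q4 = c(1, 1, 3, 3, 4, 5, 4, 5)
)

myModel <- lm(q4 ~ q1 + q2 + q3, data = mydata)

# The fitted model is an object; its parts can be extracted by name.
print(nobs(myModel))                          # rows actually used in the fit
print(round(coef(myModel), 4))                # coefficient estimates
print(round(summary(myModel)$r.squared, 4))   # multiple R-squared
```

Because the results live in an object rather than in printed output, they can be passed straight to other functions such as `anova()` or `plot()`, as the example program does.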




 R for SAS and SPSS Users, Muenchen
 R for Stata Users, Muenchen & Hilbe
 R Through Excel: A Spreadsheet Interface for Statistics, Data Analysis, and Graphics, Heiberger & Neuwirth
 Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery, Williams




 R is powerful, extensible, free
 Download it from CRAN
 Academics download Revolution R Enterprise for free at www.revolutionanalytics.com
 You run it many ways & from many packages
 Several graphical user interfaces are available
 R's programming language is the way to access its full power

muenchen@utk.edu
Slides: r4stats.com/misc/webinar
Presentation: bit.ly/R-sas-spss
