Sas-training-in-mumbai
*Built-in functions
*Data manipulation
*Updated often to include new applications
*Different packages complete certain tasks
more easily than others
*Packages we will introduce
*SAS
*R (S-plus)
*Easy to input and output data sets
*Preferred for data manipulation
*“proc” used to complete analyses with built-in
functions
*Macros used to build your own functions
*SAS Structure
*Efficient SAS Code for Large Files
*SAS Macro Facility
*Missing semicolon
*Misspelling
*Unmatched quotes/comments
*Mixing proc and data statements
*Using wrong options
*Data Step: input, create, manipulate or output
data
*Always start with a data line
*Ex. data one;
*Procedure Step: complete an operation on data
*Always start with a proc line
*Ex. proc contents;
*System options are global instructions that
affect the entire SAS session and control the
way SAS performs operations. They differ from
data set options and statement options in that,
once you set a system option, it stays in effect
for all subsequent data and proc steps in the
SAS job until you change it.
*In order to view which options are available and
in effect for your SAS session, use proc options;
run;
* center controls whether SAS procedure output is centered. By default, output is
centered. To specify not centered, use nocenter.
* date prints the date and time to the log and output window. By default, the date and
time are printed. To suppress the printing of the date, use nodate.
* label allows SAS procedures to use labels with variables. By default, labels are
permitted. To suppress the printing of labels, use nolabel.
* notes controls whether notes are printed to the SAS log. By default, notes are printed.
To suppress the printing of notes, use nonotes.
* number controls whether page numbers are printed. By default, page numbers are
printed. To suppress the printing of page numbers, use nonumber.
* linesize= specifies the line size (printer line width) for the SAS log and the SAS
procedure output file used by the data step and procedures.
* pagesize= specifies the number of lines that can be printed per page of SAS output.
* missing= specifies the character to be printed for missing numeric values.
* formchar= specifies the list of graphics characters that define table boundaries.
Example:
OPTIONS NOCENTER NODATE NONOTES LINESIZE=80 MISSING=. ;
SAS data set control options specify how SAS data
sets are input, processed, and output.
*firstobs= causes SAS to begin reading at a specified observation in a
data set. The default is firstobs=1.
*obs= specifies the last observation from a data set or the last
record from a raw data file that SAS is to read. To return to using
all observations in a data set, use obs=max.
*replace specifies whether permanently stored SAS data sets are to
be replaced. By default, the SAS system will over-write existing SAS
data sets if the SAS data set is re-specified in a data step. To
suppress this option, use noreplace.
Example:
OPTIONS OBS=100 NOREPLACE;
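A minimal sketch of how these options might be combined, assuming a data set named treat (it is read in on a later slide) already exists. Note that obs= names the last observation to read, not a count, so firstobs=5 with obs=10 reads observations 5 through 10.

```sas
/* System option: every later step reads at most the first 100 observations */
options obs=100;

/* Data set options apply to this one step only: print observations 5 to 10 */
proc print data=treat (firstobs=5 obs=10);
run;

/* Return to reading all observations in later steps */
options obs=max;
```

Used as data set options, firstobs= and obs= affect only the step they appear in, which is often safer than changing the system option for the whole session.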
Error handling options specify how the SAS System
reports on and recovers from error conditions.
* errors= controls the maximum number of observations for which
complete error messages are printed. The default maximum number
of complete error messages is errors=20
* fmterr controls whether the SAS System generates an error message
when the system cannot find a format to associate with a variable.
With fmterr in effect (the default), SAS generates an ERROR message for
every unknown format it encounters and terminates the SAS job without
running any following data and proc steps. To read a SAS data set
without requiring a SAS format library, use nofmterr.
Example:
OPTIONS ERRORS=100 NOFMTERR;
*data statement names the data set you are
making
*Can use any of the following commands to
input data
*infile Identifies an external raw data file to read
with an INPUT statement
*input Lists variable names in the input file
*cards Indicates internal data
*set Reads a SAS data set
*To look at the variables in a data set, use
proc contents data=dataset;
run;
*To look at the actual data in the data set, use
proc print data=dataset (obs=num);
var varlist;
run;
data treat;
infile "g:\shared\BIO271\treat.dat";
input id bpa bpb chola cholb;
run;
proc print data = treat (obs=10);
run;
proc contents data=treat;
run;
*blank space (default)
*DELIMITER= option specifies that the INPUT
statement use a character other than a blank
as a delimiter for data values that are read
with list input
Sometimes you want to input the data yourself
Try the following data step:
data nums;
infile datalines dsd delimiter='&';
input X Y Z;
datalines;
1&2&3
4&5&6
7&8&9
;
run;
Notice that there are no semicolons until the
end of the data lines; the closing semicolon goes
on its own line.
*Another way to input data using the
keyboard (and often a last resort if you are having
problems inputting the data) is cards
*Similar to datalines
data score;
input test1 test2 test3;
cards;
91 87 95
97 . 92
. 89 99
;
run;
*Sometimes your data will have characters
*Example:
data fam;
input name $ age;
cards;
Brian 27
Andrew 29
Kate 24
;
run;
proc print data=fam;
run;
*What is different and what happens if you
don’t have the dollar sign?
*The final way we will show to input data applies
when you already have a SAS data set: point a
libname statement at its folder
libname summer "g:\shared\bio271";
data treat2;
set summer.treat2;
run;
*Look at the data set with proc print
Variable label: Use the label statement in the data step
to assign labels to the variables.  You could also assign
labels to variables in proc steps, but then the labels only
exist for that step.  When labels are assigned in the data
step they are available for all procedures that use that
data set.
Example:
DATA labtreat;
SET treat;
LABEL id="patient id" bpa="BP on treatment A" bpb="BP on treatment B"
cholA="Cholesterol on treatment A" cholB="Cholesterol on treatment B";
RUN;
PROC CONTENTS DATA=labtreat;
RUN;
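For contrast, a label can be assigned inside a proc step, where it lasts only for that step. This is a small sketch, assuming the treat data set from the earlier slide has no stored labels.

```sas
/* Labels assigned inside a proc step last only for that step */
proc print data=treat label;   /* the LABEL option makes proc print show labels */
  label bpa = "BP on treatment A"
        bpb = "BP on treatment B";
run;

/* The labels were not stored with the data set, so none are listed here */
proc contents data=treat;
run;
```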
*Make a data set called redsox with the following
data
8, 58, 491, 163
7, 50, 469, 133
31, 107, 458, 136
33, 111, 410, 117
*Label the variables HR, RBI, AB, HITS
*Use proc print to ensure that you have input
the data correctly
*One of the best parts of SAS is the ability to complete data
manipulations
*There are four major types of manipulations
*Subset of data
   *Drop / keep variables
   *Drop observations
*Concatenate data files
*Merge data files
*Create new variables
*SAS easily allows you to make a data set
with a subset of the variables
*What do you think happens with this code?
DATA redsox2;
SET redsox;
KEEP ba rbi;
RUN;
*How do you think you could use drop to do
the same thing?
*We can also get a subset of the observations
*Read in treat2 from the g: drive
*This is helpful when we want to remove
missing data
DATA notreat2;
SET treat2;
IF cholA ^= . ;
RUN;
*SAS allows us to combine data sets by adding
more observations, using a set statement that names both
data tottreat;
set treat treat2;
run;
*Check that it worked using proc print
*If a variable is called by a different name in
each data set, rename it with the RENAME= option so the values line up:
data momdad;
set dads(RENAME=(dadinc=inc)) moms(RENAME=(mominc=inc));
run;
*SAS also allows us to add more variables by merging
data files
*The data set demo gives demographic information
about the patients in treat
*Read in demo
*Now, use this code to combine the information
data extratreat;
merge treat demo;
by id;
run;
*Note: each data set must already be sorted by the
by variable (here id) to use this code
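The sorting requirement can be met with proc sort before the merge. The sketch below assumes treat and demo both contain id; the in= flags and the subsetting if are an optional addition, not part of the original slide, that keeps only patients found in both data sets.

```sas
/* Sort both inputs by the merge key first */
proc sort data=treat; by id; run;
proc sort data=demo;  by id; run;

data extratreat;
  merge treat (in=in_treat) demo (in=in_demo);
  by id;
  /* Optional: keep only ids present in both data sets */
  if in_treat and in_demo;
run;
```

Without the last if statement, the merge keeps every id from either data set and leaves the unmatched variables missing.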
*We can make new variables in a data step
*Let’s make a new variable in the redsox data set by
finding batting average and a variable for hr30
data redsox2;
set redsox;
ba=hits/ab;
if hr>=30 then hr30=1; else hr30=0;
run;
*Make a new data set called redsox3 using the
following data and combine it with redsox
7, 51, 378, 113
4, 41, 367, 99
20, 58, 361, 109
*Make a new variable in redsox3 that equals 1 if
rbi is more than 100 and 0 if rbi is less than or
equal to 100
*file: Specifies the current output file for PUT
statements
*put: Writes lines to the SAS log, to the SAS
procedure output file, or to an external file
that is specified in the most recent FILE
statement.
Example:
data _null_;
set redsox;
file 'p:\redsox.csv' delimiter=',' dsd;
put hr rbi ab hits;
run;
*The INFILE statement specifies the input file for
any INPUT statements in the DATA step. The FILE
statement specifies the output file for any PUT
statements in the DATA step.
*Both the FILE and INFILE statements allow you to
use options that provide SAS with additional
information about the external file being used.
*An INFILE statement usually identifies data from an
external file. A DATALINES statement indicates that
data follow in the job stream. You can use the
INFILE statement with the file specification
DATALINES to take advantage of certain data-reading
options that affect how the INPUT statement reads
in-stream data.
*Missing numeric values in SAS are shown by a period (.)
*As a general rule, SAS procedures that perform
computations handle missing data by omitting
the missing values, including proc means, proc
freq, proc corr, and proc reg
*Check SAS web page for more information
* SAS treats a missing value as the smallest
possible value (e.g., negative infinity) in
logical statements.
data times6;
set times ;
if (var1 <= 1.5) then varc1 = 0; else varc1 = 1 ;
run ;
Output:
Obs  id  var1  varc1
1    1   1.5   0
2    2   .     0
3    3   2.1   1
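In the output above, observation 2 is coded 0 only because its missing var1 compares as smaller than 1.5. One way to avoid this, sketched here with a hypothetical output data set name times7, is to test for missing values before making the comparison.

```sas
data times7;
  set times;
  /* Handle missing values first so they are not treated as small numbers */
  if missing(var1) then varc1 = .;
  else if var1 <= 1.5 then varc1 = 0;
  else varc1 = 1;
run;
```

With this version, varc1 stays missing whenever var1 is missing.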
*proc print and proc contents- we have seen
these
*proc sort
*proc means
*proc univariate
*proc plot
*var: lists the variables you want to perform the
proc on
*by: breaks the data into groups
*where: limits the data set to a specific group
of observations
*output: allows you to output the results into a
data set
*We can use proc sort to sort data
*The code to complete this is
proc sort data=extratreat ; by gender ; run ;
proc sort data=extratreat out=extreat ; by gender ; run ;
proc sort data=extratreat out=extreat2; by descending
gender ; run ;
proc sort data=extratreat out=extreat3 noduplicates;
by gender ; run ;
*The basic form of proc means is
proc means data=extratreat;
var ______;
by _______;
where _______;
output out=stat mean=bpamean cholamean;
run;
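One possible way to fill in the blanks is sketched below. It assumes extratreat contains bpa, chola, and gender, as in the earlier slides, and the where condition is only an illustration. A by statement requires the data to be sorted by the by variable, so the sort comes first.

```sas
/* BY-group processing needs the data sorted by the BY variable */
proc sort data=extratreat;
  by gender;
run;

proc means data=extratreat;
  var bpa chola;                            /* variables to summarize */
  by gender;                                /* one set of statistics per gender */
  where chola ne .;                         /* use only rows with chola present */
  output out=stat mean=bpamean cholamean;   /* means saved in data set stat */
run;

/* One row per gender, holding bpamean and cholamean */
proc print data=stat;
run;
```

The names after mean= are matched to the var list in order, so bpamean holds the mean of bpa and cholamean the mean of chola.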
*The basic form of proc univariate is the
same, but much more information is given
*It is helpful to use the output window to
get the info you need
*To make different plots in SAS, you use
proc plot
*Scatterplot
proc plot data=redsox;
plot rbi*ab;
run;
*You can also make plots using
proc univariate data=redsox plot;
var rbi;
run;
*Find the mean blood pressure on treatment A
in women
*Make a scatterplot of blood pressure on
treatment B versus blood pressure on
treatment A in men
*Find the median number of home runs hit by
the Red Sox
Macros are the SAS method of making functions
*Avoid repetitious SAS code
*Create generalizable and flexible SAS code
*Pass information from one part of a SAS job to
another
*Conditionally execute data steps and PROCs
*SAS macro variable
*SAS Macro
*There are many discussions of macro variables
on the web; one good one is given here:
http://www2.sas.com/proceedings/sugi30/130-30.pdf
Two delimiters will trigger the macro
processor in a SAS program.
*&macro-variable
This refers to a macro variable. The current
value of the variable will replace &macro-variable;
*%macro-name
This refers to a macro, which consists of one or
more complete SAS statements, or even whole data
or proc steps.
*SAS Macro variables can be defined and used
anywhere in a SAS program, except in data
lines. They are independent of a SAS dataset.
%LET: assigns text to a macro variable
%LET macrovar = value;
1. macrovar is the name of a global macro variable.
2. value is the macro variable's value: a character
string without quotation marks, or a macro expression.
%PUT: displays macro variable values as text
in the SAS log, e.g. %put _all_; or %put _user_;
&macrovar: substitutes the value of a macro
variable in a program
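The three pieces can be seen together in a short sketch. The macro variable names dsn and vars are hypothetical, and the treat data set with variables bpa and bpb is assumed from the earlier slides.

```sas
%let dsn = treat;        /* the value is plain text, no quotation marks */
%let vars = bpa bpb;

/* Write the resolved values to the SAS log */
%put Data set is &dsn and variables are &vars;
%put _user_;             /* lists all user-defined macro variables */

/* Macro variables resolve inside double quotes, not single quotes */
title "Summary of &dsn";
proc means data=&dsn;
  var &vars;
run;
title;                   /* clear the title again */
```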
*Here is an example of how to use a macro variable:
%let int=treat;
proc means data=&int;
run;
*Now we can rerun the code again simply changing the
value of the macro variable, without altering the rest
of the code.
%let int=redsox;
proc means data=&int;
run;
*This is extremely helpful when you have a large
amount of code you want to reference
*Definition:
%MACRO macro-name (parm1, parm2,…parmk);
Macro definition (&parm1,&parm2,…&parmk)
%MEND macro-name;
*Application:
%macro-name(values of parm1, parm2,…,parmk);
Import Excel to SAS Datasets by a Macro
%macro excelsas(in, out);
proc import out=work.&out
datafile="g:\shared\bio271\&in"
dbms=excel replace;
getnames=yes; run;
%mend excelsas;
%excelsas(practice.xls, test)
Use proc print to ensure that you have the data
input properly
%let int=treat;
%let dop=%str(id bpa);
%macro happy;
data new;
set &int;
drop &dop;
run;
proc means data=new;
run;
%mend happy;
%happy
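Macros can also execute steps conditionally, as mentioned in the list of macro uses. The sketch below uses a hypothetical macro named summarize with a keyword parameter print=; it assumes the treat and redsox data sets from the earlier slides.

```sas
%macro summarize(dsn, vars, print=no);
  proc means data=&dsn;
    var &vars;
  run;

  /* The proc print step is generated only when print=yes is passed */
  %if %upcase(&print) = YES %then %do;
    proc print data=&dsn (obs=10);
    run;
  %end;
%mend summarize;

%summarize(treat, bpa bpb)
%summarize(redsox, hr rbi, print=yes)
```

The first call runs proc means only; the second also prints the first 10 observations.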
  • 2. *Built-in functions *Data manipulation *Updated often to include new applications *Different packages complete certain tasks more easily than others *Packages we will introduce *SAS *R (S-plus)
  • 3. *Easy to input and output data sets *Preferred for data manipulation *“proc” used to complete analyses with built-in functions *Macros used to build your own functions
  • 4. *SAS Structure *Efficient SAS Code for Large Files *SAS Macro Facility
  • 5. *Missing semicolon *Misspelling *Unmatched quotes/comments *Mixed proc and data statement *Using wrong options
  • 6. *Data Step: input, create, manipulate or output data *Always start with a data line *Ex. data one; *Procedure Step: complete an operation on data *Always start with a proc line *Ex. proc contents;
  • 7. *System options are global instructions that affect the entire SAS session and control the way SAS performs operations. SAS system options differ from SAS data set options and statement options in that once you invoke a system option, it remains in effect for all subsequent data and proc steps in a SAS job, unless you specify them. *In order to view which options are available and in effect for your SAS session, use proc options; run;
  • 8. * center controls whether SAS procedure output is centered. By default, output is centered. To specify not centered, use nocenter. * date prints the date and time to the log and output window. By default, the date and time is printed. To suppress the printing of the date, use nodate. * label allows SAS procedures to use labels with variables. By default, labels are permitted. To suppress the printing of labels, use nolabel. * notes controls whether notes are printed to the SAS log. By default, notes are printed. To suppress the printing of notes, use nonotes. * number controls whether page numbers are printed. By default, page numbers are printed. To suppress the printing of page numbers, use nonumber. * linesize= specifies the line size (printer line width) for the SAS log and the SAS procedure output file used by the data step and procedures. * pagesize= specifies # of lines that can be printed per page of SAS output. * missing= specifies the character to be printed for missing numeric values. * formchar= specifies the the list of graphics characters that define table boundaries. Example: OPTIONS NOCENTER NODATE NONOTES LINESIZE=80 MISSING=. ;
  • 9. SAS data set control options specify how SAS data sets are input, processed, and output. *firstobs= causes SAS to begin reading at a specified observation in a data set. The default is firstobs=1. *obs= specifies the last observation from a data set or the last record from a raw data file that SAS is to read. To return to using all observations in a data set use obs=all *replace specifies whether permanently stored SAS data sets are to be replaced. By default, the SAS system will over-write existing SAS data sets if the SAS data set is re-specified in a data step. To suppress this option, use noreplace. Example: *OPTIONS OBS=100 NOREPLACE;
  • 10. Error handling options specify how the SAS System reports on and recovers from error conditions. * errors= controls the maximum number of observations for which complete error messages are printed. The default maximum number of complete error messages is errors=20 * fmterr controls whether the SAS System generates an error message when the system cannot find a format to associate with a variable. SAS will generate an ERROR message for every unknown format it encounters and will terminate the SAS job without running any following data and proc steps. To read a SAS system data set without requiring a SAS format library, use nofmterr. Example: OPTIONS ERRORS=100 NOFMTERR;
  • 11. *data statement names the data set you are making *Can use any of the following commands to input data *infile Identifies an external raw data file to read with an INPUT statement *input Lists variable names in the input file *cards Indicates internal data *set Reads a SAS data set
  • 12. *To look at the variables in a data set, use *proc contents data=dataset; run; *To look at the actual data in the data set, *proc print data=dataset (obs=num); var varlist; run;
  • 13. data treat; infile “g:sharedBIO271treat.dat”; input id bpa bpb chola cholb; run; proc print data = treat (obs=10); run; proc contents data=treat; run;
  • 14. *blank space (default) *DELIMITER= option specifies that the INPUT statement use a character other than a blank as a delimiter for data values that are read with list input
  • 15. Sometimes you want to input the data yourself Try the following data step: data nums; infile datalines dsd delimiter=‘&'; input X Y Z; datalines; 1&2&3 4&5&6 7&8&9 ; Notice that there are no semicolons until the end of the datalines
  • 16. *Another way to input data using the keyboard (and often a last resort if having problems input the data) is cards *Similar to datalines *data score; input test1 test2 test3; cards; 91 87 95 97 . 92 . 89 99 ; run;
  • 17. *Sometimes your data will have characters *Example: data fam; input name$ age; cards; Brian 27 Andrew 29 Kate 24 run; proc print data=fam; run; *What is different and what happens if you don’t have the dollar sign?
  • 18. *The final way we will show to input data is if you have a SAS data set , you can use a libname command libname summer "g:sharedbio271"; data treat2; set summer.treat2; run; *Look at the data set with proc print
  • 19. Variable label: Use the label statement in the data step to assign labels to the variables.  You could also assign labels to variables in proc steps, but then the labels only exist for that step.  When labels are assigned in the data step they are available for all procedures that use that data set. Example: DATA labtreat; SET treat; LABEL id=“patient id” bpa =“BP on treatment A" bpb =“BP on treatment B" cholA=“Cholesterol on treatment A” cholB=“Cholesterol on treatment B"; RUN; PROC CONTENTS DATA=labtreat; RUN;
  • 20. *Make a data set with the following data calling it redsox *8, 58, 491, 163 7, 50, 469, 133 31, 107, 458, 136 33, 111, 410, 117 *Label the variables HR, RBI, AB, HITS *Use proc print to ensure that you have input the data correctly
  • 21. *One of the best parts of SAS is the ability to complete data manipulations *There are four major types of manipulations *Subset of data *Drop / keep variables *Drop observations *Concatenate data files *Merge data files *Create new variables
  • 22. *SAS easily allows you to make a data set with a subset of the variables *What do you think happens with this code? DATA redsox2; SET redsox; KEEP ba rbi; RUN; *How do you think you could use drop to do the same thing?
  • 23. *We can also get a subset of the observations *Read in treat2 from the g: drive *This is helpful when we want to remove missing data DATA notreat2; SET treat2; IF cholA ^= . ; RUN;
  • 24. *SAS allows us to combine dataset by adding more observations, using data tottreat; set treat treat2; run; *Check that it worked using proc print *If a variable is called by a different name in each dataset, you must use: data momdad; set dads(RENAME=(dadinc=inc)) moms(RENAME=(mominc=inc)); run;
  • 25. *SAS also allows us to add more variables by merging data files *The data set demo gives demographic information about the patients in treat *Read in demo *Now, use this code to combine the information data extratreat; merge treat demo; by id; run; *Note: the data in each data set must be sorted to use this code
  • 26. *We can make new variables in a data step *Let’s make a new variable in the redsox data set by finding batting average and a variable for hr30 data redsox2; set redsox; ba=hits/ab; if hr>=30 then hr30=1; else hr30=0; run;
  • 27. *Make a new data set called redsox3 using the following data and combine it with redsox 7, 51, 378, 113 4, 41, 367, 99 20, 58, 361, 109 *Make a new variable in redsox3 that equals 1 if rbi is more than 100 and 0 if rbi is less than or equal to 100
  • 28. *file: Specifies the current output file for PUT statements *put: Writes lines to the SAS log, to the SAS procedure output file, or to an external file that is specified in the most recent FILE statement. Example: data _null_; set redsox; file 'p:redsox.csv' delimiter=',' dsd; put hr rbi ab hits; run;
  • 29. *The INFILE statement specifies the input file for any INPUT statements in the DATA step. The FILE statement specifies the output file for any PUT statements in the DATA step. *Both the FILE and INFILE statements allow you to use options that provide SAS with additional information about the external file being used. *An INFILE statement usually identifies data from an external file. A DATALINES statement indicates that data follow in the job stream. You can use the INFILE statement with the file specification DATALINES to take advantage of certain data-reading options that affect how the INPUT statement reads in-stream data.
  • 30. *Missing values in SAS are shown by . *As a general rule, SAS procedures that perform computations handle missing data by omitting the missing values, including proc means, proc freq, proc corr, and proc reg *Check SAS web page for more information
  • 31. * SAS treats a missing value as the smallest possible value (e.g., negative infinity) in logical statements. data times6; set times ; if (var1 <= 1.5) then varc1 = 0; else varc1 = 1 ; run ; Output (columns: Obs, id, var1, varc1): 1, 1, 1.5, 0; 2, 2, ., 0; 3, 3, 2.1, 1
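In the output above, the observation with a missing var1 is coded varc1 = 0, because the missing value compares as less than 1.5. One way to avoid this is to test for missing values first. This is a sketch, assuming the same times data set and variable names as the slide; the name times7 is made up for illustration:

```sas
data times7;                       /* times7 is a hypothetical name */
  set times;
  if missing(var1) then varc1 = .; /* keep missing as missing       */
  else if var1 <= 1.5 then varc1 = 0;
  else varc1 = 1;
run;

proc print data=times7; run;
```

With this version, the observation whose var1 is missing keeps a missing varc1 instead of being grouped with the small values.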
  • 32. *proc print and proc contents- we have seen these *proc sort *proc means *proc univariate *proc plot
  • 33. *var: lists the variables you want to perform the proc on *by: breaks the data into groups *where: limits the data set to a specific group of observations *output: allows you to output the results into a data set
  • 34. *We can use proc sort to sort data *The code to complete this is proc sort data=extratreat ; by gender ; run ; proc sort data=extratreat out=extreat ; by gender ; run ; proc sort data=extratreat out=extreat2; by descending gender ; run ; proc sort data=extratreat out=extreat3 noduplicates; by gender ; run ;
  • 35. *The basic form of proc means is *proc means data=extratreat; var ______; by _______; where _______; output out=stat mean=bpamean cholamean; run; *The basic form of proc univariate is the same, but much more information is given *It is helpful to use the output window to get the info you need
  • 36. *To make different plots in SAS, you use proc plot *Scatterplot *proc plot data=redsox; plot rbi*ab; run; *You can also make plots using *proc univariate data=redsox plot; var rbi; run;
  • 37. *Find the mean blood pressure on treatment A in women *Make a scatterplot of blood pressure on treatment B versus blood pressure on treatment A in men *Find the median number of home runs hit by the Red Sox
  • 38. Macros are the SAS method of making functions *Avoid repetitious SAS code *Create generalizable and flexible SAS code *Pass information from one part of a SAS job to another *Conditionally execute data steps and PROCs
  • 39. *SAS macro variable *SAS Macro *There are many discussions of macro variables on the web; one good one is given here: http://www2.sas.com/proceedings/sugi30/130-30.pdf
  • 40. Two delimiters will trigger the macro processor in a SAS program. *&macro-variable This refers to a macro variable. The current value of the variable will replace &macro-variable; *%macro-name This refers to a macro, which consists of one or more complete SAS statements, or even whole data or proc steps.
  • 41. *SAS Macro variables can be defined and used anywhere in a SAS program, except in data lines. They are independent of a SAS dataset.
  • 42. %LET: assign text to a macro variable; %LET macrovar = value; 1. Macrovar is the name of a global macro variable; 2. Value is the macro variable value, which is a character string without quotation marks or a macro expression. %PUT: display macro variable values as text in the SAS log; %put _all_; %put _user_; &macrovar: Substitute the value of a macro variable in a program;
  • 43. *Here is an example of how to use a macro variable: *%let int=treat; proc means data=&int; run; *Now we can rerun the code by simply changing the value of the macro variable, without altering the rest of the code. *%let int=redsox; proc means data=&int; run; *This is extremely helpful when you have a large amount of code you want to reference
  • 44. *Definition: %MACRO macro-name (parm1, parm2,…parmk); Macro definition (&parm1,&parm2,…&parmk) %MEND macro-name; *Application: %macro-name(values of parm1, parm2,…,parmk);
  • 45. Import Excel to SAS Datasets by a Macro %macro excelsas(in, out); proc import out=work.&out datafile="g:sharedbio271&in" dbms=excel replace; getnames=yes; run; %mend excelsas; %excelsas(practice.xls,test) Use proc print to ensure that you have the data input properly
  • 46. %let int=treat; %let dop=%str(id bpa); %macro happy; data new; set &int; drop &dop; run; proc means data=new; run; %mend happy; %happy
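The macro on slide 46 reads its inputs from the global macro variables int and dop. The same steps can also be written with macro parameters, following the %MACRO macro-name (parm1, parm2, ...) form shown on slide 44. This is a sketch only; the macro name happy2 is made up for illustration, and treat, id and bpa are the names used on the slides:

```sas
/* Parameterized variant: int = input data set,              */
/* dop = space-separated list of variables to drop.          */
%macro happy2(int, dop);
  data new;
    set &int;
    drop &dop;
  run;

  proc means data=new;
  run;
%mend happy2;

/* The drop list contains spaces but no commas, so it can be */
/* passed as a single positional argument.                   */
%happy2(treat, id bpa)
```

Passing the values as parameters keeps them local to the macro call, so the macro can be reused on another data set without first changing any %let statements.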