SASTechies [email_address] http://www.sastechies.com

Data Match Merging in SAS

  • 2. data finance.duejan;
          set finance.loans;
          Interest=amount*(rate/12);
       run;

       SAS Data Set Finance.Loans

       Account     Amount   Rate     Months   Payment
       101-1092     22000   0.1000       60    467.43
       101-1731    114000   0.0950      360    958.57
       101-1289     10000   0.1050       36    325.02
       101-3144      3500   0.1050       12    308.52

       SAS Techies 2009, 11/13/09
  • 3. Each time the SET statement is executed, SAS reads one observation into the program data vector. SET reads all variables and all observations from the input data sets unless you tell SAS to do otherwise. A SET statement can name multiple data sets, and a DATA step can contain multiple SET statements.

       SET <SAS-data-set(s) <(data-set-option(s))>> <options>;
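One way to "tell SAS to do otherwise" is to attach data set options to the input data set. The sketch below is only an illustration: it builds a hypothetical WORK copy of the Finance.Loans rows (the name `loans` and the output name `first_two` are assumptions), then uses KEEP= to limit the variables and OBS= to limit the observations that SET reads.

```sas
/* Hypothetical WORK copy of the Finance.Loans rows, so the example is self-contained */
data loans;
   input Account $ Amount Rate Months Payment;
   datalines;
101-1092 22000 0.1000 60 467.43
101-1731 114000 0.0950 360 958.57
101-1289 10000 0.1050 36 325.02
101-3144 3500 0.1050 12 308.52
;
run;

/* KEEP= reads only Account and Amount; OBS=2 stops after the second observation */
data first_two;
   set loans (keep=Account Amount obs=2);
run;

proc print data=first_two;
run;
```

Under these assumptions, FIRST_TWO should contain two observations (accounts 101-1092 and 101-1731) and two variables.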
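  • 4. SET statement examples:

       /* Subsetting IF: every observation is read, then filtered */
       data lab23.drug1h;
          set research.cltrials;
          if placebo='YES';
       run;

       /* WHERE statement: only matching observations are read */
       data lab23.drug1h;
          set research.cltrials;
          where placebo='YES';
       run;

       /* WHERE= data set option: the same filter, attached to the input data set */
       data lab23.drug1h;
          set research.cltrials (where=(placebo='YES'));
       run;

       /* Two data sets in one SET statement: all of A is read, then all of C */
       data lab23.drug1h;
          set A C;
       run;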
  • 5. /* Works: placebo is read, tested by the subsetting IF, then dropped on output */
       data lab23.drug1h (drop=placebo);
          set research.cltrials (drop=triglycerides uricacid);
          if placebo='YES';
       run;

       /* Does not work: placebo is dropped on input, so the IF never sees its values */
       data lab23.drug1h (drop=placebo);
          set research.cltrials (drop=triglycerides uricacid placebo);
          if placebo='YES';
       run;

       If you don't process certain variables and you don't want them to appear in the new data set, specify them in the DROP= option in the SET statement.

       If you do need to process a variable from the original data set (in a subsetting IF statement, for example), specify the variable in the DROP= option in the DATA statement instead. Otherwise the variable is never read into the program data vector: SAS treats it as a new, uninitialized variable, and the statement that uses it does not produce the intended result.
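  • 6. /* Match-merge: sort both inputs by the BY variable, then MERGE */
       proc sort data=a; by num; run;
       proc sort data=b; by num; run;

       data sharad;
          merge a b;
          by num;
       run;

       /* Concatenate: SET reads all of A, then all of B */
       data sharad;
          set a b;
       run;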
  • 7. The DATA step provides a large number of other programming features for manipulating data sets. For example, you can
         - use IF-THEN/ELSE logic to control processing based on one or more conditions
         - specify additional data set options
         - perform calculations
         - create new variables
         - process variables in arrays
         - use SAS functions
         - use special variables such as FIRST. and LAST. to control processing.

       You can also combine SAS data sets in other ways, including match-merging, interleaving, one-to-one merging, and updating.
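Of the other combining methods mentioned, interleaving is the closest relative of concatenation: it is a SET statement with a BY statement. The sketch below contrasts the two on a pair of tiny, made-up data sets (the names `a`, `b`, `stacked`, and `interleaved` and their values are assumptions for illustration). Both inputs are already in `num` order, which interleaving requires.

```sas
/* Two small hypothetical inputs, each already sorted by num */
data a;
   input num x $;
   datalines;
1 a1
3 a3
;
run;

data b;
   input num y $;
   datalines;
2 b2
3 b3
;
run;

/* Concatenation: all observations from A, then all from B (num = 1, 3, 2, 3) */
data stacked;
   set a b;
run;

/* Interleaving: observations from both inputs in BY order (num = 1, 2, 3, 3) */
data interleaved;
   set a b;
   by num;
run;

proc print data=interleaved;
run;
```

Neither step combines observations side by side: each output has four observations, with `x` missing on rows that came from B and `y` missing on rows that came from A.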
  • 8. DATA output-SAS-data-set;
          MERGE SAS-data-set-1 SAS-data-set-2;
          BY variable(s);
       RUN;

       Match-merging produces an output data set that contains values from all observations in all input data sets.
         - You can specify any number of input data sets in the MERGE statement.
         - In DATA step match-merging, all data sets to be merged must be sorted or indexed by the values of the BY variable(s).
         - The common variable must have the same type and length in all data sets to be merged.
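The syntax above can be tried on a small worked case. The data sets below are hypothetical (the names `clients` and `amounts` and every value are invented for illustration); the sketch sorts both by the common variable `Name` and then match-merges them.

```sas
/* Hypothetical input data, deliberately not in Name order */
data clients;
   input Name $ City $;
   datalines;
Smith Austin
Ames Boston
Lee Denver
;
run;

data amounts;
   input Name $ Amount;
   datalines;
Lee 300
Ames 120
Ames 80
;
run;

/* Both inputs must be sorted (or indexed) by the BY variable before MERGE */
proc sort data=clients; by Name; run;
proc sort data=amounts; by Name; run;

data combined;
   merge clients amounts;
   by Name;
run;

proc print data=combined;
run;
```

With this input, COMBINED should have four observations: two for Ames (amounts 120 and 80, each carrying City=Boston), one for Lee, and one for Smith with a missing Amount because Smith has no row in AMOUNTS.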
  • 9. PROC SORT <DATA=SAS-data-set> <OUT=SAS-data-set> <options>;
          BY variable(s);
       RUN;

       Interesting features:
         - NODUPKEY option
         - NODUPRECS option
         - WHERE statement

       Note: If you don't use the OUT= option, PROC SORT replaces the data set specified in the DATA= option with the sorted version.

       Obs   ID     Age   Sex   Date
       1     A001   21    m     05/22/75
       2     A001   21    m     05/22/75
       3     A003   24    f     08/17/72
       4     A004   .           03/27/69
       5     A005   44    f     02/24/52
       6     A007   39    m     11/11/57

       Obs   ID     Age   Sex   Date
       1     A001   21    m     05/22/75
       2     A001   32    m     06/15/63
       3     A003   24    f     08/17/72
       4     A004   .           03/27/69
       5     A005   44    f     02/24/52
       6     A007   39    m     11/11/57
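A short sketch of how OUT= and NODUPKEY behave, using a few rows modeled on the listings above. The data set names (`visits`, `visits_sorted`, `visits_unique`) are assumptions, and the A004 row is left out only to keep the list input simple.

```sas
/* Hypothetical unsorted input with two observations for ID A001 */
data visits;
   input ID $ Age Sex $ Date :mmddyy8.;
   format Date mmddyy8.;
   datalines;
A003 24 f 08/17/72
A001 21 m 05/22/75
A001 32 m 06/15/63
A005 44 f 02/24/52
;
run;

/* OUT= writes the sorted copy to a new data set; VISITS itself is left unchanged */
proc sort data=visits out=visits_sorted;
   by ID;
run;

/* NODUPKEY keeps only the first observation for each value of the BY variable */
proc sort data=visits out=visits_unique nodupkey;
   by ID;
run;

proc print data=visits_unique;
run;
```

Here VISITS_SORTED keeps all four observations, while VISITS_UNIQUE should keep three: the second A001 row (Age 32) is discarded because its BY value repeats, even though its other values differ.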
  • 10. data clinic.combined;
           merge clinic.demog (rename=(date=BirthDate))
                 clinic.visit (rename=(date=VisitDate));
           by id;
           if BirthDate = '05Mar2005'd;      /* use the new name inside the step */
           rename BirthDate=somedate;        /* takes effect only in the output data set */
        run;

        Note: once a variable has been renamed with RENAME= on an input data set, refer to it by the new name everywhere in that DATA step.

        (RENAME=(old-variable-name=new-variable-name))

        where
          - the RENAME= option, in parentheses, follows the name of each data set that contains one or more variables to be renamed
          - old-variable-name names the variable to be renamed
          - new-variable-name specifies the new name for the variable.

        You can rename any number of variables in each occurrence of the RENAME= option.
  • 11. data combined;
           merge clients (in=A) Amounts (in=B);
           by Name;
           if A and B;
        run;

        Note: If the expression is true for the observation, the current observation is written to the output data set.

        (IN=variable)

        where
          - the IN= option, in parentheses, follows the data set name
          - variable names the variable to be created.

        Use
          - the IN= data set option to create and name a variable that indicates whether the data set contributed data to the current observation
          - the subsetting IF statement to check the IN= values and output only those observations that appear in the data sets for which IN= is specified.
  • 12. The Compilation Phase: Setting Up the New Data Set

        To prepare to merge data sets, SAS software
          - reads the descriptor portions of the data sets listed in the MERGE statement
          - reads the remainder of the DATA step program
          - creates the program data vector (PDV)
          - assigns a tracking pointer to each data set listed in the MERGE statement.

        If variables with the same name appear in more than one data set, the variable from the first data set that contains it (in the order listed in the MERGE statement) determines the length of the variable.
  • 13. The Execution Phase

        After compiling the DATA step, SAS software sequentially match-merges observations by moving the pointers down each observation of each data set and checking whether the BY values match.

        If Yes, the observations are read into the PDV in the order in which the data sets appear in the MERGE statement. (Remember that values of any like-named variable are overwritten by values of the like-named variable in subsequent data sets.) SAS software writes the combined observation to the new data set and retains the values in the PDV until the BY value changes in all the data sets.
  • 14. If No, SAS software determines which of the BY values comes first and reads the observation containing that value into the PDV. Then the observation is written to the new data set.
  • 15. When the BY value changes in all the input data sets, the PDV is initialized to missing.

        The DATA step merge continues to process every observation in each data set until it exhausts all observations in all data sets.
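The compilation and execution rules described above can be traced on a small case. The sketch below uses two hypothetical data sets (`left` and `right`, with invented values) that share the BY variable `id` and a like-named variable `val`; the PUT statement writes the PDV contents to the log once per iteration so the overwriting and retaining behavior can be observed.

```sas
/* Hypothetical inputs, already sorted by id */
data left;
   input id val $ tag $;
   datalines;
1 L1 one
2 L2 two
;
run;

data right;
   input id val $ extra;
   datalines;
1 R1 10
1 R1b 11
3 R3 30
;
run;

data traced;
   merge left right;
   by id;
   put id= val= tag= extra=;   /* trace each combined observation in the log */
run;
```

Under these assumptions the log should show four observations. For id=1 the value of `val` comes from RIGHT both times (R1, then R1b) because the later data set overwrites the like-named variable, while `tag` stays "one" on the second row because the PDV is retained within the BY group. For id=2 `extra` is missing, and for id=3 `tag` is blank, because the PDV was reset to missing when the BY value changed.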
  • 16. Handling Unmatched Observations and Missing Values

        By default, all observations read into the PDV, including observations with missing data and no matching BY values, are written to the output data set. (If you specify a subsetting IF statement to select observations, only those that meet the IF condition are written.)
          - If an observation contains missing values for a variable, the observation in the output data set contains the missing values as well. Observations with missing values for the BY variable appear at the top of the output data set.
          - If an input data set doesn't have any observations for a given value of the common variable, the observation in the output data set contains missing values for the variables unique to that input data set.
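When unmatched observations need different treatment, one common approach is to combine IN= variables with explicit OUTPUT statements. The sketch below is an illustration only: the data sets, their values, and the names `inClients`, `inAmounts`, `matched`, `only_clients`, and `only_amounts` are all assumptions.

```sas
/* Hypothetical inputs, already sorted by Name */
data clients;
   input Name $ City $;
   datalines;
Ames Boston
Lee Denver
Smith Austin
;
run;

data amounts;
   input Name $ Amount;
   datalines;
Ames 120
Lee 300
Zhou 50
;
run;

/* Route each merged observation according to which inputs contributed to it */
data matched only_clients only_amounts;
   merge clients (in=inClients) amounts (in=inAmounts);
   by Name;
   if inClients and inAmounts then output matched;
   else if inClients then output only_clients;
   else output only_amounts;
run;
```

With this input, MATCHED should hold Ames and Lee, ONLY_CLIENTS should hold Smith with a missing Amount, and ONLY_AMOUNTS should hold Zhou with a blank City.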
  • 18. The DATA step provides a large number of other programming features for manipulating data sets during match-merging. For example, you can
          - use IF-THEN/ELSE logic to control processing based on one or more conditions
          - specify additional data set options
          - perform calculations
          - create new variables
          - process variables in arrays
          - use SAS functions
          - use special variables such as FIRST. and LAST. to control processing.
  • 19. options pageno=1 nodate linesize=80 pagesize=60;

        data testfile;
           set some;
           by Drug Rx;
           if first.Drug then TRx=0;   /* reset the total at the start of each Drug group */
           TRx+Rx;                     /* sum statement: TRx is retained across observations */
           if last.Drug then output;   /* write one observation per Drug group */
        run;

        Input data set Some, with the FIRST. and LAST. values for each observation:

        Drug   Rx   FIRST.Drug   FIRST.Rx   LAST.Drug   LAST.Rx
        A      10   1            1          0           1
        A      11   0            1          1           1
        B      11   1            1          0           1
        B      12   0            1          1           1

        Output Testfile (Rx column not shown):

        Drug   TRx
        A      21
        B      23

        When an observation is the first in a BY group, SAS sets the value of FIRST.variable to 1 for the variable whose value changed, as well as for all of the variables that follow it in the BY statement. For all other observations in the BY group, the value of FIRST.variable is 0.

        Likewise, if the observation is the last in a BY group, SAS sets the value of LAST.variable to 1 for the variable whose value changes on the next observation, as well as for all of the variables that follow it in the BY statement. For all other observations in the BY group, the value of LAST.variable is 0. For the last observation in a data set, all LAST.variable values are set to 1.
  • 20. System options
          - apply to the data sets and output for the entire session
          - can be overridden by data set options
          - can be declared anywhere except within DATALINES/CARDS data

        Ex: options compress=yes obs=max;

        Ex:
           options compress=no obs=max;
           data new (compress=yes);   /* COMPRESS= applies to output data sets */
              set cool (obs=10);
           run;

        Data set options
          - apply to that particular data set only
          - CANNOT be overridden by system options
          - can be specified only in parentheses after the data set name

        Ex:
           data new (compress=yes);
              set cool (obs=10);
           run;

        Note: options are separated by blanks, not commas, both in the OPTIONS statement and inside the data set option parentheses.
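The precedence between the two kinds of options can be checked with a small experiment. This is a sketch under assumptions: the 20-observation data set `cool` is generated here only so the example is self-contained, and the output names `capped` and `wider` are invented.

```sas
/* Hypothetical 20-observation input */
data cool;
   do n = 1 to 20;
      output;
   end;
run;

options obs=5;                /* system option: limits every data set read from here on */

data capped;
   set cool;                  /* only the system option applies: 5 observations are read */
run;

data wider;
   set cool (obs=10);         /* data set option overrides the system option: 10 observations */
run;

options obs=max;              /* restore the default so later steps read everything */
```

Checking the log notes for CAPPED and WIDER should confirm 5 and 10 observations respectively. Resetting OBS= at the end matters because a system option stays in effect for the rest of the session.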
  • 21. GOTO label;

        The GOTO statement tells SAS to jump immediately to the statement label that is indicated in the GOTO statement and to continue executing statements from that point until a RETURN statement is executed. A RETURN statement after a GOTO statement returns execution to the beginning of the next DATA step iteration.

        LINK label;

        The LINK statement tells SAS to jump immediately to the statement label that is indicated in the LINK statement and to continue executing statements from that point until a RETURN statement is executed. The RETURN statement sends program control to the statement immediately following the LINK statement.
  • 22. LINK Statement

        data hydro;
           input type $ depth station $;
           if type='aluv' then link calcu;
           date=today();
           return;
           calcu: if station='site_1' then elevatn=6650-depth;
                  else if station='site_2' then elevatn=5500-depth;
                  return;
           datalines;
        aluv 523 site_1
        uppa 234 site_2
        aluv 666 site_2
        ... more data lines ...
        ;

        GOTO Statement

        data info;
           input x;
           if 1<=x<=5 then goto add;
           put x=;
           return;
           add: sumx+x;
           return;
           datalines;
        7
        4
        323
        ;
        run;

Editor's Notes

  • #3: SASTechies.com Sharad C Narnindi - Attic Technologies 2005