SAS Interview Questions and Answers
Intermediate / 1 to 5 years experienced level questions & answers
Ques 1. Explain the difference between PROC MEANS and PROC SUMMARY.
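Both PROC MEANS and PROC SUMMARY compute the same summary statistics and accept the same statements and options. The main difference is the default output: PROC MEANS prints a report by default, while PROC SUMMARY prints nothing unless the PRINT option is specified, so it is normally used with an OUTPUT statement to write the statistics to a dataset. In addition, when the VAR statement is omitted, PROC MEANS analyzes all numeric variables, whereas PROC SUMMARY only reports a count of observations.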
Example:
PROC MEANS DATA=dataset; VAR variable; RUN;
PROC SUMMARY DATA=dataset; VAR variable; OUTPUT OUT=summary_dataset MEAN=mean_value; RUN;
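As a rough illustration of the default-output behavior, the sketch below runs both procedures on SASHELP.CLASS, a sample dataset shipped with SAS (its use here is an assumption for illustration, not part of the original example). With the PRINT option, PROC SUMMARY should produce the same kind of report as PROC MEANS.

```sas
/* PROC MEANS prints a report by default */
PROC MEANS DATA=sashelp.class MEAN;
  VAR height;
RUN;

/* PROC SUMMARY prints nothing by default; PRINT turns the report on */
PROC SUMMARY DATA=sashelp.class MEAN PRINT;
  VAR height;
RUN;

/* Conversely, NOPRINT makes PROC MEANS behave like PROC SUMMARY */
PROC MEANS DATA=sashelp.class NOPRINT;
  VAR height;
  OUTPUT OUT=height_stats MEAN=mean_height;
RUN;
```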
Ques 2. What is the difference between WHERE and IF statements in SAS?
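WHERE subsets observations as they are read from an input dataset and can be used in both DATA steps and procedures; it can reference only variables that already exist in that dataset. IF is available only in the DATA step. A subsetting IF (IF condition;) filters observations after they have been loaded into the program data vector, so it can also test variables created in the step, while IF-THEN/ELSE conditionally executes statements based on a specified condition.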
Example:
DATA new_dataset; SET old_dataset; WHERE variable > 50; IF variable2 = 'Yes' THEN new_variable = 'Positive'; ELSE new_variable = 'Negative'; RUN;
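One practical consequence is that WHERE can only test variables that already exist in the input dataset, while a subsetting IF can test a variable computed in the same step. A minimal sketch, assuming the SASHELP.CLASS sample dataset; the variable ratio and the cutoff values are hypothetical:

```sas
DATA heavy_for_height;
  SET sashelp.class;
  /* WHERE filters rows as they are read; age exists in the input dataset */
  WHERE age > 12;
  /* ratio is created in this step, so WHERE could not reference it */
  ratio = weight / height;
  /* Subsetting IF: keep only observations where the condition is true */
  IF ratio > 1.5;
RUN;
```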
Ques 3. Explain the concept of a macro in SAS.
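A SAS macro is a named block of text, compiled by the macro processor, that generates SAS code when it is called. The generated code is then compiled and executed like any other SAS code. Macros allow code reuse and parameterization and simplify repetitive tasks.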
Example:
%macro PrintMessage;
%put Hello, this is a macro!;
%mend PrintMessage;
%PrintMessage;
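The parameterization mentioned above can be sketched with keyword parameters. The macro name print_first and its parameters ds and n are hypothetical, and SASHELP.CLASS is assumed to be available as sample data:

```sas
/* ds= names the dataset to print; n= is the number of rows (default 5) */
%macro print_first(ds=, n=5);
  PROC PRINT DATA=&ds (OBS=&n);
  RUN;
%mend print_first;

/* Generates a PROC PRINT step for the first 3 rows of SASHELP.CLASS */
%print_first(ds=sashelp.class, n=3)
```

Setting OPTIONS MPRINT; before the call writes the generated SAS code to the log, which makes it easier to see what the macro produced.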
Ques 4. What is the purpose of the RETAIN statement in SAS?
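The RETAIN statement is used in the DATA step to prevent variables from being reset to missing at the start of each iteration, so a value assigned in one iteration is still available in the next. It matters for variables created in the step (by assignment or INPUT); variables read with SET, MERGE, or UPDATE are retained automatically.
Example:
DATA new_dataset;
  SET old_dataset;
  /* running_total is created in this step; without RETAIN it would be reset to missing on every iteration */
  RETAIN running_total 0;
  running_total = SUM(running_total, variable1);
RUN;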
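Another common use of RETAIN is carrying the last non-missing value forward. The following is a minimal sketch under the assumption that old_dataset contains a numeric variable1 with some missing values; last_value is a hypothetical new variable:

```sas
DATA filled_dataset;
  SET old_dataset;
  /* last_value keeps its value from the previous iteration */
  RETAIN last_value;
  /* Update it only when the current observation has a value */
  IF NOT MISSING(variable1) THEN last_value = variable1;
RUN;
```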
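Ques 5. Explain the difference between CLASS and BY statements in PROC MEANS.
Both statements produce statistics for subgroups. CLASS does not require the input to be sorted, reports all groups in a single table, and, in an output dataset, can also produce statistics for the overall data and for combinations of class variables (identified by _TYPE_). BY requires the input to be sorted or indexed by the BY variables and produces a separate analysis for each BY group.
Example:
PROC MEANS DATA=dataset; VAR variable; CLASS category_variable; RUN;
/* BY requires the input to be sorted by the BY variable */
PROC SORT DATA=dataset; BY grouping_variable; RUN;
PROC MEANS DATA=dataset; VAR variable; BY grouping_variable; RUN;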
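The difference also shows up in output datasets. As a sketch, assuming the SASHELP.CLASS sample dataset (class_stats and mean_height are hypothetical names), a CLASS statement yields an overall row in addition to one row per class level:

```sas
PROC MEANS DATA=sashelp.class NOPRINT;
  CLASS sex;
  VAR height;
  OUTPUT OUT=class_stats MEAN=mean_height;
RUN;

/* class_stats holds a _TYPE_=0 row (all observations)
   and _TYPE_=1 rows (one per value of sex) */
PROC PRINT DATA=class_stats;
RUN;
```

Adding the NWAY option to the PROC MEANS statement keeps only the rows for the full combination of class variables and drops the overall row.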
Ques 6. What is the purpose of the MERGE statement in SAS?
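The MERGE statement combines observations from two or more datasets side by side into single observations. With a BY statement it performs a match-merge, joining observations that share the same values of the BY variables; each input dataset must be sorted or indexed by those variables. Without a BY statement it performs a one-to-one merge, pairing observations by position.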
Example:
DATA merged_dataset; MERGE dataset1 dataset2; BY common_variable; RUN;
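A match-merge keeps observations from either dataset by default. The IN= dataset option can be used to keep only matching observations; the sketch below assumes dataset1 and dataset2 both contain common_variable, and in1 and in2 are hypothetical temporary flags:

```sas
/* Match-merge requires both inputs to be sorted by the BY variable */
PROC SORT DATA=dataset1; BY common_variable; RUN;
PROC SORT DATA=dataset2; BY common_variable; RUN;

DATA matched_only;
  /* in1/in2 are 1 when the dataset contributed to the current observation */
  MERGE dataset1 (IN=in1) dataset2 (IN=in2);
  BY common_variable;
  /* Keep only observations present in both datasets */
  IF in1 AND in2;
RUN;
```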
Ques 7. How do you debug SAS programs?
SAS programs can be debugged using techniques such as the PUT statement, PROC PRINT, and the DATA step debugger. Additionally, the %PUT statement can be used for macro debugging.
Example:
DATA debug_dataset; SET original_dataset; /* Add PUT statements for debugging */ PUT 'Value of variable:' variable; RUN;
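A slightly fuller debugging sketch, assuming original_dataset contains a variable named variable: PUTLOG always writes to the log, the name= form prints the variable name together with its value, and the MPRINT, MLOGIC, and SYMBOLGEN system options add macro diagnostics to the log.

```sas
/* Show generated code, macro logic, and macro variable resolution in the log */
OPTIONS MPRINT MLOGIC SYMBOLGEN;

DATA _NULL_;
  /* Limit the check to the first 5 observations */
  SET original_dataset (OBS=5);
  /* Writes e.g. "checking _N_=1 variable=..." to the log */
  PUTLOG 'checking ' _N_= variable=;
RUN;
```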
Ques 8. What is the purpose of the FORMAT procedure in SAS?
The FORMAT procedure in SAS is used to create custom formats for variables, defining the appearance of data values in output reports.
Example:
PROC FORMAT; VALUE gender_fmt 1='Male' 2='Female'; RUN;
DATA formatted_dataset; SET original_dataset; FORMAT gender gender_fmt.; RUN;
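Custom formats can also map ranges, which is useful for grouping values without creating a new variable. A minimal sketch, assuming the SASHELP.CLASS sample dataset; the format name age_group_fmt and the cut points are hypothetical:

```sas
PROC FORMAT;
  VALUE age_group_fmt
    LOW -< 13 = 'Child'     /* below 13 */
    13 -< 20  = 'Teen'      /* 13 up to, but not including, 20 */
    20 - HIGH = 'Adult'
    OTHER     = 'Unknown';  /* anything else, including missing */
RUN;

/* PROC FREQ groups by the formatted value */
PROC FREQ DATA=sashelp.class;
  TABLES age;
  FORMAT age age_group_fmt.;
RUN;
```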
Ques 9. What is the purpose of the ODS statement in SAS?
The ODS (Output Delivery System) statement in SAS is used to control the output destination and format of SAS output, such as creating HTML, PDF, or RTF files.
Example:
ODS HTML FILE='output.html'; PROC PRINT DATA=dataset; RUN; ODS HTML CLOSE;
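ODS can also restrict which output tables are written. The sketch below assumes the SASHELP.CLASS sample dataset and sends only the BasicMeasures table of PROC UNIVARIATE to a PDF file; the file name is a placeholder.

```sas
ODS PDF FILE='output.pdf';

/* Keep only the basic statistical measures table for the next step */
ODS SELECT BasicMeasures;
PROC UNIVARIATE DATA=sashelp.class;
  VAR height;
RUN;

ODS PDF CLOSE;
```

Running ODS TRACE ON; before a procedure lists the names of the output tables it produces in the log, which is how a name such as BasicMeasures can be looked up.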
Ques 10. Explain the concept of an informat in SAS.
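An informat is an instruction that tells SAS how to read raw data values into a variable, for example how many characters to read and whether the text represents a number, a date, or a character string.
Example:
DATA formatted_dataset;
  /* :$20. reads name with the $20. informat, stopping at the first blank */
  INPUT name :$20. age height;
  DATALINES;
John 25 180
Alice 30 165
;
RUN;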
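Informats matter most when the raw text is not a plain number. As a sketch with made-up data (the dataset, variable names, and values are all hypothetical), DATE9. reads a date string and COMMA10. reads a number containing a thousands separator:

```sas
DATA visits;
  /* The colon modifier applies the informat but still stops at a blank */
  INPUT patient_id visit_date :DATE9. cost :COMMA10.;
  /* Display the stored date value in a readable form */
  FORMAT visit_date DATE9.;
  DATALINES;
101 15JAN2024 1,250
102 03FEB2024 980
;
RUN;
```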
Ques 11. What is the purpose of the SUM statement in the DATA step?
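The sum statement (variable + expression;) adds the value of an expression to an accumulator variable. The accumulator is automatically initialized to 0 and retained across iterations, and missing values of the expression are ignored. It is commonly used for running totals and counters, either across all observations or, when reset at FIRST.variable, within each BY group.
Example:
DATA summary_dataset;
  SET original_dataset;
  BY category_variable;
  /* Restart the total at the first observation of each BY group */
  IF FIRST.category_variable THEN running_total = 0;
  running_total + variable;
  /* Keep one observation per group, holding the group total */
  IF LAST.category_variable THEN OUTPUT;
RUN;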
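The sum statement is shorthand for a RETAIN combined with the SUM function. A minimal sketch of the two equivalent forms, assuming original_dataset contains a numeric variable named variable; total_a, total_b, and row_count are hypothetical names:

```sas
DATA totals;
  SET original_dataset;

  /* Sum statement: total_a starts at 0, is retained, and ignores missing values */
  total_a + variable;

  /* Equivalent long form */
  RETAIN total_b 0;
  total_b = SUM(total_b, variable);

  /* The same statement works as a counter */
  row_count + 1;
RUN;
```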
Ques 12. How can you identify and handle missing values in SAS?
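Missing values can be identified with functions such as MISSING (tests a single value), NMISS (counts missing numeric arguments), and CMISS (counts missing numeric or character arguments), or with the IS MISSING (or IS NULL) operator in WHERE expressions and PROC SQL. They are handled by excluding the affected observations or by imputing replacement values.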
Example:
DATA non_missing_dataset; SET original_dataset; /* Exclude observations with missing values */ IF NOT MISSING(variable) THEN OUTPUT; RUN;
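The two steps, identifying and then handling, can be sketched as follows. This assumes original_dataset has a numeric variable named variable; replacing missing values with 0 is only a placeholder choice for illustration, not a recommended imputation method.

```sas
/* Identify: count non-missing (N) and missing (NMISS) values per numeric variable */
PROC MEANS DATA=original_dataset N NMISS;
RUN;

/* Handle: COALESCE returns the first non-missing argument */
DATA imputed_dataset;
  SET original_dataset;
  variable_filled = COALESCE(variable, 0);
RUN;
```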
Ques 13. What is the purpose of the TRANSPOSE procedure in SAS?
The TRANSPOSE procedure in SAS is used to reorganize data, transforming rows into columns and vice versa. It is often used for reshaping datasets.
Example:
PROC TRANSPOSE DATA=input_dataset OUT=output_dataset; VAR variable; BY category_variable; ID observation_variable; RUN;
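A small worked sketch may make the reshaping clearer. The data below are made up for illustration: each student has one row per subject, and the transpose produces one row per student with one column per subject.

```sas
DATA long_scores;
  INPUT student $ subject $ score;
  DATALINES;
Ann Math 90
Ann Art 85
Bob Math 70
Bob Art 95
;
RUN;

/* BY processing requires sorted input */
PROC SORT DATA=long_scores; BY student; RUN;

PROC TRANSPOSE DATA=long_scores OUT=wide_scores (DROP=_NAME_);
  BY student;   /* one output row per student */
  ID subject;   /* values of subject become column names */
  VAR score;    /* values of score fill those columns */
RUN;
```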
Ques 14. Explain the difference between SAS functions SCAN and SUBSTR.
SCAN is used to extract words from a string based on a specified delimiter, while SUBSTR is used to extract a substring from a given position in a string.
Example:
DATA parsed_dataset; SET original_dataset; /* Extract second word from variable */ new_variable = SCAN(variable, 2); /* Extract substring starting at position 3 */ another_variable = SUBSTR(variable, 3); RUN;
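To see the two functions side by side, the sketch below uses a hypothetical comma-separated string. SCAN counts words separated by the given delimiter, while SUBSTR counts character positions and takes an optional length.

```sas
DATA _NULL_;
  text = 'alpha,beta,gamma';
  /* Second word, using a comma as the delimiter */
  second_word = SCAN(text, 2, ',');
  /* Four characters starting at position 7 */
  from_position = SUBSTR(text, 7, 4);
  /* Both variables are written to the log as name=value */
  PUT second_word= from_position=;
RUN;
```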
Ques 15. What is the purpose of the SQL procedure in SAS?
The SQL procedure in SAS is used to perform SQL queries on SAS datasets. It allows for data manipulation, retrieval, and aggregation using SQL syntax.
Example:
PROC SQL; SELECT variable1, AVG(variable2) AS avg_value FROM dataset GROUP BY variable1; QUIT;
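PROC SQL can also write its result to a new dataset and filter on aggregated values. A minimal sketch, assuming the SASHELP.CLASS sample dataset; sex_summary, n, and avg_height are hypothetical names:

```sas
PROC SQL;
  /* CREATE TABLE stores the result as a dataset instead of printing it */
  CREATE TABLE sex_summary AS
  SELECT sex,
         COUNT(*)    AS n,
         AVG(height) AS avg_height
  FROM sashelp.class
  GROUP BY sex
  /* HAVING filters groups after aggregation */
  HAVING COUNT(*) > 5;
QUIT;
```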
Ques 16. Explain the difference between SAS formats and informats.
Formats control how data values are displayed in output, while informats control how raw data values are read during input. A format does not change the stored value. They are assigned with the FORMAT and INFORMAT statements, respectively.
Example:
DATA formatted_dataset;
  /* Informats tell INPUT how to read the raw values */
  INFORMAT name $20. age 3. height 5.;
  /* A format controls how the stored value is displayed */
  FORMAT height 5.1;
  INPUT name age height;
  DATALINES;
John 25 180
Alice 30 165
;
RUN;
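The same distinction applies to the INPUT and PUT functions: INPUT uses an informat to convert text into a stored value, and PUT uses a format to convert a stored value into text. A minimal sketch with a hypothetical date string:

```sas
DATA _NULL_;
  raw_text = '15JAN2024';
  /* Informat: text -> numeric SAS date (days since 01JAN1960) */
  date_value = INPUT(raw_text, DATE9.);
  /* Format: numeric SAS date -> display text */
  display_text = PUT(date_value, MMDDYY10.);
  /* date_value prints as a plain number because it has no format attached */
  PUT date_value= display_text=;
RUN;
```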
Ques 17. What is the purpose of the SGPLOT procedure in SAS?
The SGPLOT procedure in SAS is used for creating statistical graphics and plots. It provides a versatile way to visualize data using different types of charts.
Example:
PROC SGPLOT DATA=dataset; SCATTER x=variable1 y=variable2; RUN;
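Several plot statements can be layered in one PROC SGPLOT step. The sketch below assumes the SASHELP.CLASS sample dataset and overlays a regression line on a grouped scatter plot; the axis labels are illustrative.

```sas
PROC SGPLOT DATA=sashelp.class;
  /* One marker style per value of sex */
  SCATTER X=height Y=weight / GROUP=sex;
  /* Fitted regression line; NOMARKERS avoids drawing the points twice */
  REG X=height Y=weight / NOMARKERS;
  XAXIS LABEL='Height';
  YAXIS LABEL='Weight';
RUN;
```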