SAS Interview Questions and Answers
Freshers / Beginner level questions & answers
Ques 1. What is SAS?
SAS (Statistical Analysis System) is a software suite used for advanced analytics, business intelligence, data management, and predictive analytics.
Example:
SAS is used in various industries for tasks such as data analysis, reporting, and statistical modeling.
Ques 2. Explain the role of the WHERE statement in the DATA step.
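The WHERE statement in the DATA step filters observations as they are read from the input dataset, so only rows that meet the condition are processed by the step. Because the filter is applied before an observation is loaded, the condition can reference only variables that already exist in the input dataset.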
Example:
DATA subset_dataset; SET original_dataset; WHERE age > 18; RUN;
Ques 3. What is the purpose of the SORT procedure in SAS?
The SORT procedure in SAS is used to sort a dataset based on one or more variables. It can be used to arrange data in ascending or descending order.
Example:
PROC SORT DATA=unsorted_dataset OUT=sorted_dataset; BY variable; RUN;
Ques 4. How do you concatenate datasets vertically in SAS?
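Listing two or more datasets in a single SET statement concatenates them vertically: all observations from the first dataset are followed by those from the second, and so on. The datasets do not need identical variables; a variable that is absent from one dataset is set to missing for that dataset's observations. Variables that share a name must have the same type, and the length is taken from the first dataset in which the variable appears.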
Example:
DATA combined_dataset; SET dataset1 dataset2; RUN;
Ques 5. What is the purpose of the FREQ procedure in SAS?
The FREQ procedure in SAS is used to generate frequency tables for categorical variables, showing the count and percentage of each category.
Example:
PROC FREQ DATA=dataset; TABLES category_variable; RUN;
Ques 6. How do you create a SAS macro variable?
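A SAS macro variable is most commonly created with the %LET statement. The %GLOBAL statement declares a macro variable with global scope (initially null), and values can also be assigned at run time with CALL SYMPUTX in a DATA step or the INTO clause in PROC SQL. Macro variables store text that can be reused anywhere in a program by referencing &name.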
Example:
%LET my_variable = 100; /* Create a macro variable */
%PUT Value of my_variable: &my_variable;
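Macro variables can also be created while a program runs. The following is a minimal sketch, assuming a dataset named dataset exists; the macro variable names row_count and sql_count are hypothetical and chosen only for illustration.

```sas
/* Create a macro variable from a DATA step value with CALL SYMPUTX */
DATA _NULL_;
  SET dataset END=last_obs;
  IF last_obs THEN CALL SYMPUTX('row_count', _N_);
RUN;
%PUT Rows read: &row_count;

/* Create a macro variable from a query result with INTO */
PROC SQL NOPRINT;
  SELECT COUNT(*) INTO :sql_count TRIMMED FROM dataset;
QUIT;
%PUT Rows counted: &sql_count;
```

CALL SYMPUTX is useful when the value depends on the data, whereas %LET assigns fixed text before any step runs.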
Ques 7. Explain the purpose of the CONTENTS procedure in SAS.
The CONTENTS procedure in SAS is used to display information about the structure and attributes of a dataset, including variable names, types, lengths, and formats.
Example:
PROC CONTENTS DATA=dataset; RUN;
Intermediate / 1 to 5 years experienced level questions & answers
Ques 8. Explain the difference between PROC MEANS and PROC SUMMARY.
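PROC MEANS and PROC SUMMARY compute the same summary statistics and accept the same statements. The main difference is in their defaults: PROC MEANS prints its results, while PROC SUMMARY prints nothing unless the PRINT option is specified and is typically used with an OUTPUT statement to write results to a dataset. In addition, if the VAR statement is omitted, PROC MEANS analyzes all numeric variables, whereas PROC SUMMARY reports only a count of observations.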
Example:
PROC MEANS DATA=dataset; VAR variable; RUN;
PROC SUMMARY DATA=dataset; VAR variable; OUTPUT OUT=summary_dataset MEAN=mean_value; RUN;
Ques 9. What is the difference between WHERE and IF statements in SAS?
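WHERE can be used in both DATA steps and procedures. It selects observations before they are read into the step, so it can reference only variables in the input dataset. IF is available only in the DATA step and is evaluated after the observation has been read, so it can also test variables created in the step. An IF without THEN (a subsetting IF) keeps only observations that meet the condition, while IF-THEN/ELSE conditionally executes statements.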
Example:
DATA new_dataset;
  SET old_dataset;
  WHERE variable > 50;
  IF variable2 = 'Yes' THEN new_variable = 'Positive';
  ELSE new_variable = 'Negative';
RUN;
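IF can also be written without THEN, as a subsetting IF, and that form is where it differs most from WHERE. The following is a minimal sketch, assuming old_dataset contains numeric variables named price and quantity (hypothetical names used only for illustration).

```sas
/* WHERE filters on variables that already exist in the input dataset */
DATA where_subset;
  SET old_dataset;
  WHERE price > 50;
RUN;

/* A subsetting IF can test a variable created in the same step */
DATA if_subset;
  SET old_dataset;
  total = price * quantity;
  IF total > 1000;  /* WHERE total > 1000 would fail: total is not in old_dataset */
RUN;
```

When the condition uses only input variables, WHERE is usually the more efficient choice because rejected observations are never loaded into the step.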
Ques 10. Explain the concept of a macro in SAS.
A macro in SAS is a named block of text, compiled by the macro processor, that generates SAS code when it is called. Macros support code reuse, parameterization, and automation of repetitive tasks.
Example:
%macro PrintMessage;
%put Hello, this is a macro!;
%mend PrintMessage;
%PrintMessage;
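Parameterization can be sketched with keyword parameters. The macro name PrintSubset and its parameters ds, var, and cutoff are hypothetical, and the call assumes a dataset named dataset with a numeric variable age.

```sas
%macro PrintSubset(ds=, var=, cutoff=);
  /* The macro processor substitutes the parameter values into this code */
  PROC PRINT DATA=&ds;
    WHERE &var > &cutoff;
  RUN;
%mend PrintSubset;

%PrintSubset(ds=dataset, var=age, cutoff=18)
```

Each call generates a complete PROC PRINT step, so the same logic can be reused for different datasets and thresholds.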
Ques 11. What is the purpose of the RETAIN statement in SAS?
The RETAIN statement is used in the DATA step to prevent the automatic initialization of variables to missing values at the beginning of each iteration. Variables read with SET or MERGE are retained automatically, so RETAIN matters mainly for variables created in the step.
Example:
DATA new_dataset;
  SET old_dataset;
  RETAIN running_total 0; /* Keeps its value across iterations; starts at 0 */
  running_total = SUM(running_total, variable1);
RUN;
Ques 12. Explain the difference between CLASS and BY statements in PROC MEANS.
Both statements produce statistics for subgroups. The CLASS statement does not require sorted data and reports all groups in a single table. The BY statement requires the data to be sorted or indexed by the BY variables and produces a separate analysis for each BY group.
Example:
PROC MEANS DATA=dataset; VAR variable; CLASS category_variable; RUN;
PROC SORT DATA=dataset; BY grouping_variable; RUN; /* BY-group processing requires sorted data */
PROC MEANS DATA=dataset; VAR variable; BY grouping_variable; RUN;
Ques 13. What is the purpose of the MERGE statement in SAS?
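The MERGE statement combines observations from two or more datasets side by side. With a BY statement it performs a match-merge, joining observations that share the same values of the BY variables; the input datasets must be sorted or indexed by those variables. Without a BY statement it performs a one-to-one merge by observation position.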
Example:
DATA merged_dataset; MERGE dataset1 dataset2; BY common_variable; RUN;
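A match-merge expects each input to be in BY-variable order, and the IN= dataset option can be used to control which keys are kept. The following is a minimal sketch using the same dataset names as above; in1, in2, and matched_only are hypothetical names.

```sas
/* Sort both inputs by the key before merging */
PROC SORT DATA=dataset1; BY common_variable; RUN;
PROC SORT DATA=dataset2; BY common_variable; RUN;

DATA matched_only;
  MERGE dataset1 (IN=in1) dataset2 (IN=in2);
  BY common_variable;
  IF in1 AND in2; /* Keep only keys present in both datasets */
RUN;
```

Dropping the subsetting IF keeps every key from either dataset, with missing values where one side has no match.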
Ques 14. How do you debug SAS programs?
SAS programs can be debugged using techniques such as the PUT statement, PROC PRINT, and the DATA step debugger. Additionally, the %PUT statement can be used for macro debugging.
Example:
DATA debug_dataset;
  SET original_dataset;
  /* Add PUT statements for debugging */
  PUT 'Value of variable:' variable;
RUN;
Ques 15. What is the purpose of the FORMAT procedure in SAS?
The FORMAT procedure in SAS is used to create custom formats for variables, defining the appearance of data values in output reports.
Example:
PROC FORMAT; VALUE gender_fmt 1='Male' 2='Female'; RUN;
DATA formatted_dataset; SET original_dataset; FORMAT gender gender_fmt.; RUN;
Ques 16. What is the purpose of the ODS statement in SAS?
The ODS (Output Delivery System) statement in SAS is used to control the output destination and format of SAS output, such as creating HTML, PDF, or RTF files.
Example:
ODS HTML FILE='output.html'; PROC PRINT DATA=dataset; RUN; ODS HTML CLOSE;
Ques 17. Explain the concept of an informat in SAS.
An informat in SAS is used to read data into SAS variables during input. It defines the input format of raw data.
Example:
DATA formatted_dataset;
  INPUT name :$20. age height; /* :$20. reads name as character, up to 20 characters */
  DATALINES;
John 25 180
Alice 30 165
;
RUN;
Ques 18. What is the purpose of the SUM statement in the DATA step?
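The SUM statement, written as variable + expression, adds the expression to an accumulator variable on each iteration. The accumulator is automatically retained, starts at 0, and ignores missing values. It can accumulate across all observations or, if it is reset at the start of each BY group, within BY groups.
Example:
DATA summary_dataset;
  SET original_dataset;
  BY category_variable;
  IF FIRST.category_variable THEN running_total = 0; /* Reset for each BY group */
  running_total + variable; /* Accumulate within the group */
  IF LAST.category_variable THEN OUTPUT; /* One row per group with its total */
RUN;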
Ques 19. How can you identify and handle missing values in SAS?
Missing values can be identified with the MISSING function, with the NMISS and CMISS functions (which count missing arguments), or in a WHERE expression with the IS MISSING operator. Handling missing values involves techniques such as imputation or excluding observations with missing values.
Example:
DATA non_missing_dataset;
  SET original_dataset;
  /* Exclude observations with missing values */
  IF NOT MISSING(variable) THEN OUTPUT;
RUN;
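Counting and simple imputation can be sketched as follows, assuming original_dataset contains variables named variable1 and variable2, with variable1 numeric (hypothetical names used only for illustration).

```sas
/* Report non-missing (N) and missing (NMISS) counts for all numeric variables */
PROC MEANS DATA=original_dataset N NMISS;
RUN;

DATA flagged_dataset;
  SET original_dataset;
  /* CMISS counts missing arguments of either type; NMISS is numeric only */
  n_missing = CMISS(variable1, variable2);
  /* COALESCE returns the first non-missing numeric argument */
  variable1_filled = COALESCE(variable1, 0);
RUN;
```

Replacing missing values with 0 is only one possible imputation rule; whether it is appropriate depends on the analysis.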
Ques 20. What is the purpose of the TRANSPOSE procedure in SAS?
The TRANSPOSE procedure in SAS is used to reorganize data, transforming rows into columns and vice versa. It is often used for reshaping datasets.
Example:
PROC TRANSPOSE DATA=input_dataset OUT=output_dataset; VAR variable; BY category_variable; ID observation_variable; RUN;
Ques 21. Explain the difference between SAS functions SCAN and SUBSTR.
SCAN is used to extract words from a string based on a specified delimiter, while SUBSTR is used to extract a substring from a given position in a string.
Example:
DATA parsed_dataset;
  SET original_dataset;
  /* Extract second word from variable */
  new_variable = SCAN(variable, 2);
  /* Extract substring starting at position 3 */
  another_variable = SUBSTR(variable, 3);
RUN;
Ques 22. What is the purpose of the SQL procedure in SAS?
The SQL procedure in SAS is used to perform SQL queries on SAS datasets. It allows for data manipulation, retrieval, and aggregation using SQL syntax.
Example:
PROC SQL; SELECT variable1, AVG(variable2) AS avg_value FROM dataset GROUP BY variable1; QUIT;
Ques 23. Explain the difference between SAS formats and informats.
Formats are used to control the appearance of data values in output, while informats are used to read data values during input. They are applied using the FORMAT and INFORMAT statements, respectively.
Example:
DATA formatted_dataset;
  /* Informats control how raw values are read */
  INFORMAT name $20. age 3. height 5. date_variable DATE9.;
  /* Formats control how stored values are displayed */
  FORMAT date_variable DATE9.;
  INPUT name age height date_variable;
  DATALINES;
John 25 180 15JAN2024
Alice 30 165 03MAR2024
;
RUN;
Ques 24. What is the purpose of the SGPLOT procedure in SAS?
The SGPLOT procedure in SAS is used for creating statistical graphics and plots. It provides a versatile way to visualize data using different types of charts.
Example:
PROC SGPLOT DATA=dataset; SCATTER x=variable1 y=variable2; RUN;