Statistics Interview Questions and Answers
Ques 11. Explain the concept of p-hacking.
P-hacking refers to the manipulation of statistical analyses, methods, or data to produce statistically significant results, often by testing multiple hypotheses until one reaches significance.
Example:
Conducting multiple tests on the same data until a significant result is found and then reporting only that result.
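To see why this is a problem, here is a minimal sketch, assuming the tests are independent and every null hypothesis is actually true. The function names, the 0.05 threshold, and the choice of 20 tests are illustrative assumptions, not part of the original answer. With 20 such tests, the chance of at least one "significant" result is about 64%, even though no real effect exists.

```python
import random


def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_p_hacking(alpha: float, num_tests: int, trials: int, seed: int = 0) -> float:
    """Fraction of simulated studies in which at least one test looks significant.

    Under a true null hypothesis a p-value is uniform on [0, 1], so each
    p-value is drawn with rng.random() (a simplifying assumption).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        p_values = [rng.random() for _ in range(num_tests)]
        # The "p-hacker" reports the study as a success if any test passes.
        if min(p_values) < alpha:
            hits += 1
    return hits / trials


analytic = familywise_error_rate(0.05, 20)
simulated = simulate_p_hacking(0.05, 20, trials=10_000)
print(f"analytic: {analytic:.3f}, simulated: {simulated:.3f}")
```

The simulated rate should land close to the analytic one, which is why reporting only the one significant test out of many is misleading.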
Ques 12. What is the difference between correlation and covariance?
Correlation is a standardized measure of the strength and direction of the linear relationship between two variables. Covariance measures the extent to which two variables change together, but it is not standardized.
Example:
The correlation coefficient always lies between -1 and 1 and has no units; covariance can take any real value, and its magnitude depends on the units of the two variables.
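The difference can be illustrated with a short sketch. The height and weight numbers below are made-up illustrative data, and the helper functions are written by hand so the arithmetic is visible. Changing the height unit from metres to centimetres multiplies the covariance by 100 but leaves the correlation unchanged.

```python
import math


def covariance(x, y):
    """Sample covariance (divides by n - 1)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def correlation(x, y):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


# Hypothetical data, for illustration only.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weights_kg = [50, 58, 65, 77, 82]
heights_cm = [h * 100 for h in heights_m]  # same heights, different unit

print("covariance (m, kg): ", covariance(heights_m, weights_kg))
print("covariance (cm, kg):", covariance(heights_cm, weights_kg))
print("correlation (m, kg): ", correlation(heights_m, weights_kg))
print("correlation (cm, kg):", correlation(heights_cm, weights_kg))
```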
Ques 13. Define multicollinearity in regression analysis.
Multicollinearity occurs when two or more independent variables in a regression model are highly correlated, making it difficult to identify the individual effect of each variable on the dependent variable.
Example:
In a regression predicting house prices, square footage and number of bedrooms are often strongly correlated. Including both makes it hard to separate their individual effects, and the coefficient estimates can become unstable.
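A common diagnostic is the variance inflation factor (VIF). The sketch below covers only the special case of exactly two predictors, where the VIF of each predictor reduces to 1 / (1 - r^2), with r the correlation between the two. The housing numbers are hypothetical, and the function names are illustrative. With more predictors, each VIF requires regressing that predictor on all the others.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2).

    Undefined (division by zero) if the predictors are perfectly correlated.
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical houses, for illustration only.
square_feet = [800, 1000, 1200, 1500, 1800, 2200, 2600, 3000]
bedrooms = [1, 2, 2, 3, 3, 4, 4, 5]

r = pearson_r(square_feet, bedrooms)
vif = vif_two_predictors(square_feet, bedrooms)
print(f"r = {r:.3f}, VIF = {vif:.1f}")
```

A frequently quoted rule of thumb treats a VIF above 5 or 10 as a sign of problematic multicollinearity, though the cutoff is a convention rather than a strict rule.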
Ques 14. What is a Q-Q plot used for?
A Q-Q plot (Quantile-Quantile plot) is used to assess whether a dataset follows a particular theoretical distribution, like the normal distribution. It compares the quantiles of the observed data to the quantiles of the expected distribution.
Example:
Checking if a set of exam scores follows a normal distribution using a Q-Q plot.
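The points of a normal Q-Q plot can be computed without any plotting library. This is a minimal sketch, assuming the plotting positions (i - 0.5) / n, which is one of several common conventions. The exam scores are made up, and `qq_points` is an illustrative name. If the data are roughly normal, the resulting pairs fall close to a straight line when plotted.

```python
from statistics import NormalDist


def qq_points(data):
    """Return (theoretical quantile, observed value) pairs for a normal Q-Q plot.

    Uses plotting positions (i - 0.5) / n for i = 1..n (an assumed convention).
    """
    observed = sorted(data)
    n = len(observed)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


# Hypothetical exam scores, for illustration only.
scores = [62, 71, 55, 80, 68, 74, 90, 66, 77, 59]

for theoretical_q, observed_q in qq_points(scores):
    print(f"{theoretical_q:6.2f}  {observed_q}")
```

Plotting the first column on the x-axis against the second on the y-axis gives the Q-Q plot; systematic curvature away from a straight line suggests skewness or heavy tails.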
Ques 15. Explain the term 'power' in statistics.
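Power is the probability that a statistical test correctly rejects a false null hypothesis. Equivalently, power = 1 - β, where β is the probability of a Type II error (failing to reject a false null). It is the ability of a test to detect an effect, given that the effect truly exists.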
Example:
A study with a larger sample size generally has higher power to detect a true effect.
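The link between sample size and power can be made concrete for a simple case. The sketch below assumes a two-sided one-sample z-test with a known standard deviation; the effect size, sigma, and sample sizes are illustrative assumptions, and `z_test_power` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def z_test_power(effect: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power of a two-sided one-sample z-test with known sigma.

    effect is the true difference between the population mean and the
    null-hypothesis mean.
    """
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha / 2)
    # How many standard errors the true mean sits away from the null value.
    shift = effect * math.sqrt(n) / sigma
    # Probability that the test statistic lands in either rejection region.
    return standard_normal.cdf(-z_crit + shift) + standard_normal.cdf(-z_crit - shift)


for n in (10, 25, 50, 100):
    print(f"n = {n:3d}: power = {z_test_power(effect=0.3, sigma=1.0, n=n):.3f}")
```

Holding the effect size and significance level fixed, the printed power rises as n grows. When the true effect is zero, the same formula returns alpha, the false positive rate.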