가장 많이 묻는 면접 질문과 답변 & 온라인 테스트
면접 준비, 온라인 테스트, 튜토리얼, 라이브 연습을 위한 학습 플랫폼

집중 학습 경로, 모의고사, 면접 준비 콘텐츠로 실력을 키우세요.

WithoutBook은 주제별 면접 질문, 온라인 연습 테스트, 튜토리얼, 비교 가이드를 하나의 반응형 학습 공간으로 제공합니다.

Prepare Interview

PySpark 면접 질문과 답변

Ques 26. What is the purpose of the 'accumulator' in PySpark?

An 'accumulator' is a variable that can be used in parallel operations and is updated by multiple tasks. It is typically used for implementing counters or sums in distributed computing.

Example:

accumulator = spark.sparkContext.accumulator(0)

# Inside a transformation or action
accumulator.add(1)

도움이 되었나요? Add Comment View Comments
 

Ques 27. Explain the use of the 'broadcast' hint in PySpark.

The 'broadcast' hint is used to explicitly instruct PySpark to use a broadcast join strategy for better performance, especially when one DataFrame is significantly smaller than the other.

Example:

from pyspark.sql.functions import broadcast

result = df1.join(broadcast(df2), 'key')

도움이 되었나요? Add Comment View Comments
 

Ques 28. What is the purpose of the 'agg' method in PySpark?

The 'agg' method is used for aggregating data in a PySpark DataFrame. It allows you to perform various aggregate functions like sum, avg, max, min, etc., on specified columns.

Example:

result = df.agg({'Sales': 'sum', 'Quantity': 'avg'})

도움이 되었나요? Add Comment View Comments
 

Ques 29. How can you handle data skewness in PySpark?

Data skewness can be handled by using techniques like salting, bucketing, or using the 'broadcast' hint to distribute data more evenly across partitions.

Example:

df.write.option('skew_hint', 'true').parquet('output_path')

도움이 되었나요? Add Comment View Comments
 

Ques 30. Explain the purpose of the 'coalesce' method in PySpark.

The 'coalesce' method is used to reduce the number of partitions in a PySpark DataFrame. It helps in optimizing the performance when the number of partitions is unnecessarily large.

Example:

df_coalesced = df.coalesce(5)

도움이 되었나요? Add Comment View Comments
 

Most helpful rated by users:

Copyright © 2026, WithoutBook.