Apache Spark Interview Questions and Answers
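Question: What is the difference between a DataFrame and an RDD in Spark?
Answer: A DataFrame is a distributed collection of data organized into named columns, similar to a relational table. Because it carries a schema, Spark SQL's Catalyst optimizer can plan and optimize queries against it. An RDD (Resilient Distributed Dataset) is a low-level abstraction representing a distributed collection of objects. It has no schema, so Spark runs the functions you pass to it as written.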
Rated most helpful by users:
- What is the purpose of the Spark SQL module?
- Explain the difference between narrow and wide transformations in Spark.