Sqoop Interview Questions and Answers
Intermediate-level questions and answers (1 to 5 years of experience)
Ques 1. Explain the import command in Sqoop.
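The import command in Sqoop copies data from a relational database into HDFS. Each row of the source table becomes a record in the target directory, and the transfer runs as a set of parallel map tasks.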
Example:
sqoop import --connect jdbc:mysql://localhost:3306/db --table mytable --target-dir /user/hadoop/mytable
Ques 2. How can you perform an incremental import in Sqoop?
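Incremental imports in Sqoop use the --incremental option with one of two modes: append, for tables that only receive new rows with an increasing key, and lastmodified, for tables whose rows are updated and carry a timestamp column. The --check-column option names the column to examine, and --last-value gives the highest value already imported; only rows beyond that value are imported.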
Example:
sqoop import --connect jdbc:mysql://localhost:3306/db --table mytable --incremental append --check-column id --last-value 100
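The lastmodified mode follows the same pattern. The sketch below assumes a timestamp column named updated_at and a primary key id, neither of which comes from the example above; it also assumes the target directory already exists, in which case Sqoop expects either --append or --merge-key.

```shell
# Dry run by default: prints the command (without shell quoting).
# Set SQOOP=sqoop to run it on a cluster.
SQOOP="${SQOOP:-echo sqoop}"

# updated_at, the timestamp, and the target path are assumed values for illustration.
$SQOOP import \
  --connect jdbc:mysql://localhost:3306/db \
  --table mytable \
  --target-dir /user/hadoop/mytable \
  --incremental lastmodified \
  --check-column updated_at \
  --last-value "2024-01-01 00:00:00" \
  --merge-key id
```

With --merge-key, rows that changed since the last value replace their older copies in the target directory instead of being appended as duplicates.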
Ques 3. Explain the export command in Sqoop.
The export command in Sqoop is used to export data from Hadoop to a relational database.
Example:
sqoop export --connect jdbc:mysql://localhost:3306/db --table mytable --export-dir /user/hadoop/mytable
Ques 4. What is the metastore in Sqoop?
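The metastore in Sqoop is a shared repository for saved jobs created with the sqoop job tool. It stores each job's definition, including its connection parameters and import or export options, and for incremental imports it records the last imported value so the next run resumes from there. By default, saved jobs live in a private repository under $HOME/.sqoop; a shared metastore is started with the sqoop metastore tool and reached with the --meta-connect option.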
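As a rough illustration, a saved job can be created, listed, and executed against a shared metastore as follows. The host metastore.example.com and the job name myjob are hypothetical; port 16000 is the metastore default.

```shell
# Dry run by default: prints each command. Set SQOOP=sqoop to run on a cluster.
SQOOP="${SQOOP:-echo sqoop}"

# Hypothetical metastore host; 16000 is the default metastore port.
META=jdbc:hsqldb:hsql://metastore.example.com:16000/sqoop

# Save an incremental import as a named job in the shared metastore.
$SQOOP job --meta-connect "$META" --create myjob -- import \
  --connect jdbc:mysql://localhost:3306/db \
  --table mytable \
  --incremental append --check-column id --last-value 0

# List the saved jobs, then run one. The stored last value is updated after each run.
$SQOOP job --meta-connect "$META" --list
$SQOOP job --meta-connect "$META" --exec myjob
```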
Ques 5. Explain the purpose of the --map-column-java option in Sqoop.
The --map-column-java option allows you to specify how the columns from the database table should be mapped to Java types during import.
Example:
sqoop import --connect jdbc:mysql://localhost:3306/db --table mytable --map-column-java id=String,value=Double
Ques 6. What is the difference between the free-form query import and table-based import in Sqoop?
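A table-based import (--table) reads a single table, optionally restricted with --columns and --where. A free-form query import (--query) runs an arbitrary SELECT statement, which allows joins and computed columns. A free-form query must contain the $CONDITIONS token in its WHERE clause, must specify --target-dir, and needs --split-by unless it runs with a single mapper (-m 1).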
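The two styles can be compared side by side. This is a sketch only: the second table (other), its label column, and the target directories are made up for illustration.

```shell
# Dry run by default: prints each command (without shell quoting).
# Set SQOOP=sqoop to run on a cluster.
SQOOP="${SQOOP:-echo sqoop}"

# Table-based import, restricted to two columns and a row filter.
$SQOOP import \
  --connect jdbc:mysql://localhost:3306/db \
  --table mytable \
  --columns "id,value" \
  --where "id > 100" \
  --target-dir /user/hadoop/mytable_subset

# Free-form import of a join; 'other' is a hypothetical second table.
# Single quotes keep the shell from expanding $CONDITIONS.
$SQOOP import \
  --connect jdbc:mysql://localhost:3306/db \
  --query 'SELECT m.id, m.value, o.label FROM mytable m JOIN other o ON (m.id = o.id) WHERE $CONDITIONS' \
  --split-by m.id \
  --target-dir /user/hadoop/mytable_joined
```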
Ques 7. Explain the purpose of the --merge-key option in Sqoop.
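The --merge-key option names the column (typically the primary key) that identifies a row. The sqoop merge tool uses it to combine two datasets, keeping the newer record when both contain the same key. It can also be used with --incremental lastmodified imports to reconcile updated rows with the data already in the target directory.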
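A minimal sketch of a merge run is shown below. The directory names, mytable.jar, and the class name mytable are assumptions; in practice the jar and class are the ones generated by the earlier import or by sqoop codegen.

```shell
# Dry run by default: prints the command. Set SQOOP=sqoop to run it on a cluster.
SQOOP="${SQOOP:-echo sqoop}"

# Paths, jar file, and class name are assumed values for illustration.
$SQOOP merge \
  --new-data /user/hadoop/mytable_new \
  --onto /user/hadoop/mytable_base \
  --target-dir /user/hadoop/mytable_merged \
  --jar-file mytable.jar \
  --class-name mytable \
  --merge-key id
```

Rows in the new dataset replace rows in the older one that share the same id; all other rows are carried over unchanged.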
Ques 8. Explain the purpose of the --query option in Sqoop.
The --query option allows you to specify a SQL SELECT statement to retrieve data during Sqoop import.
Example:
sqoop import --connect jdbc:mysql://localhost:3306/db --query 'SELECT * FROM mytable WHERE $CONDITIONS' --split-by id --target-dir /user/hadoop/mytable
Ques 9. Explain the purpose of the --boundary-query option in Sqoop.
The --boundary-query option allows you to specify a SQL query that is used to determine the range of values for the splitting column.
Example:
sqoop import --connect jdbc:mysql://localhost:3306/db --table mytable --split-by id --boundary-query 'SELECT MIN(id), MAX(id) FROM mytable'
Ques 11. How can you import data into Hive using Sqoop?
You can import data into Hive using Sqoop by specifying the --hive-import option along with the target Hive table using --hive-table.
Example:
sqoop import --connect jdbc:mysql://localhost:3306/db --table mytable --hive-import --hive-table myhivetable
Ques 12. Explain the purpose of the --hive-overwrite option in Sqoop.
The --hive-overwrite option in Sqoop is used to overwrite existing data in the Hive table during import.
Example:
sqoop import --connect jdbc:mysql://localhost:3306/db --table mytable --hive-import --hive-table myhivetable --hive-overwrite
Ques 13. How can you perform an export operation in Sqoop to update existing records?
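To update existing records during export in Sqoop, pass --update-key with the column or columns that identify a row; Sqoop then generates UPDATE statements instead of INSERT statements. The default --update-mode is updateonly, which modifies only rows that already exist. Setting --update-mode to allowinsert also inserts rows that have no match (an upsert), provided the database connector supports it.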
Example:
sqoop export --connect jdbc:mysql://localhost:3306/db --table mytable --update-key id --update-mode allowinsert --export-dir /user/hadoop/mytable
Ques 14. Explain the purpose of the --hcatalog-database option in Sqoop.
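The --hcatalog-database option specifies the HCatalog database name for an HCatalog import or export. It must be used together with --hcatalog-table, which names the table; if --hcatalog-database is omitted, the default database is used.
Example:
sqoop import --connect jdbc:mysql://localhost:3306/db --table mytable --hcatalog-database mydatabase --hcatalog-table mytable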