Interview Questions and Answers

Know the top Hadoop interview questions and answers for freshers and experienced candidates to prepare for job interviews.

Total 40 questions

Ques 1

What is Hadoop?

Hadoop is an open-source distributed computing platform written in Java. Its core components, HDFS and MapReduce, are modeled on the Google File System and Google's MapReduce.

Ques 2

What platform and Java version are required to run Hadoop?

Early Hadoop releases required Java 1.6.x or higher, preferably from Sun; Hadoop 3.x releases require Java 8 or later. Linux and Windows are the supported operating systems for Hadoop, but BSD, Mac OS/X, and Solaris are also known to work.

Ques 3

What kind of Hardware is best for Hadoop?

Hadoop can run on dual-processor or dual-core machines with 4-8 GB of ECC RAM. The best configuration depends on the workflow needs.

Ques 4

What are the most common input formats defined in Hadoop?

These are the most common input formats defined in Hadoop:

TextInputFormat
KeyValueTextInputFormat
SequenceFileInputFormat

TextInputFormat is the default input format.

Ques 5

How do you categorize big data?

Big data can be categorized by the following features:

Volume
Velocity
Variety

Ques 6

Give the use of the bootstrap panel.

Panels in Bootstrap are used for boxing DOM components.

Ques 7

What is the purpose of button groups?

Button groups are used to place more than one button on the same line.

Ques 8

Name the various types of lists supported by Bootstrap.

Ordered list
Unordered list
Definition list

Ques 9

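Which command is used to check the status of the daemons running in the Hadoop cluster?

The 'jps' command is used to check the status of the daemons running in the Hadoop cluster. It lists the Java processes running on a node, so it shows which Hadoop daemons (for example NameNode, DataNode, JobTracker, and TaskTracker) are up.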

Ques 10

What is InputSplit in Hadoop? Explain.

When a Hadoop job runs, it splits the input files into chunks and assigns each chunk to a mapper for processing. Each such chunk is called an InputSplit.

Ques 11

What is TextInputFormat in Hadoop?

In TextInputFormat, each line in the text file is a record. The key is the byte offset of the line within the file, and the value is the content of the line. For instance, Key: LongWritable, Value: Text.
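
The key/value layout described above can be illustrated with a small, self-contained Python sketch. It is not Hadoop code: `text_records` is a hypothetical helper that only mimics how line records pair a byte offset with the line's text.

```python
def text_records(data: bytes):
    """Yield (byte_offset, line_text) pairs, one per line of the input."""
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        # The key is where the line starts; the value drops the line terminator.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)


sample = b"hello hadoop\nhdfs stores blocks\nbye\n"
records = list(text_records(sample))
for key, value in records:
    print(key, value)
```

Note that the keys are byte offsets, not line numbers, so they grow by the length of each line including its newline.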

Ques 12

What is the SequenceFileInputFormat in Hadoop?

In Hadoop, SequenceFileInputFormat is used to read sequence files. A sequence file is a binary file format of key-value pairs, optionally compressed, which is commonly used to pass data from the output of one MapReduce job to the input of another MapReduce job.

Ques 13

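How many InputSplits are made by the Hadoop framework?

The number depends on the file sizes and the block size. For example, assuming a 64 MB block size, Hadoop makes 5 splits for the following three files: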

One split for 64K files
Two splits for 65MB files, and
Two splits for 127MB files
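
The arithmetic behind these counts can be checked with a short Python sketch. It assumes a 64 MB block size and one split per block-sized chunk; `count_splits` is a hypothetical helper, and the real split computation in Hadoop applies additional rules (such as configurable minimum and maximum split sizes), so actual counts can differ.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # assumed 64 MB block size


def count_splits(file_size: int, split_size: int = BLOCK_SIZE) -> int:
    """Number of split-sized chunks needed to cover file_size bytes."""
    # Integer ceiling division: a partial final chunk still needs its own split.
    return (file_size + split_size - 1) // split_size


files = {
    "64K file": 64 * 1024,
    "65MB file": 65 * 1024 * 1024,
    "127MB file": 127 * 1024 * 1024,
}
splits = {name: count_splits(size) for name, size in files.items()}
total = sum(splits.values())
print(splits, total)
```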

Ques 14

What is the use of RecordReader in Hadoop?

An InputSplit defines a slice of work but does not describe how to access it. The RecordReader class is responsible for loading the data from its source and converting it into key-value pairs suitable for reading by the Mapper. The RecordReader instance is defined by the InputFormat.

Ques 15

What is JobTracker in Hadoop?

JobTracker is the service in classic (Hadoop 1) MapReduce that schedules and runs MapReduce jobs on the cluster. In Hadoop 2 and later, YARN takes over this role.

Ques 16

What are Hadoop's three main configuration files?

The three main configuration files in Hadoop are:

core-site.xml
mapred-site.xml
hdfs-site.xml

Ques 17

What is WebDAV in Hadoop?

WebDAV is a set of extensions to HTTP that supports editing and uploading files. On most operating systems, WebDAV shares can be mounted as filesystems, so it is possible to access HDFS as a standard filesystem by exposing HDFS over WebDAV.

Ques 18

What is Sqoop in Hadoop?

Sqoop is a tool used to transfer data between a Relational Database Management System (RDBMS) and Hadoop HDFS. With Sqoop, you can import data from an RDBMS such as MySQL or Oracle into HDFS, and export data from HDFS files back to an RDBMS.

Ques 19

What are the functionalities of JobTracker?

These are the main tasks of JobTracker:

To accept jobs from the client.
To communicate with the NameNode to determine the location of the data.
To locate TaskTracker Nodes with available slots.
To submit the work to the chosen TaskTracker nodes and monitor the progress of each task.
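
The "locate TaskTrackers with available slots" step can be pictured with a toy Python sketch. Everything here is hypothetical (the `assign_tasks` function, the host names, and the slot counts), and the preference for a host that already holds the task's data is an assumption of this sketch, not a description of the actual JobTracker scheduler.

```python
def assign_tasks(tasks, trackers):
    """Assign each task to a tracker with a free slot, preferring data-local hosts.

    tasks: list of (task_id, set of hosts holding the task's input data)
    trackers: dict mapping host name -> number of free slots
    """
    free = dict(trackers)  # copy so the caller's dict is left untouched
    assignment = {}
    for task_id, data_hosts in tasks:
        local = [h for h in sorted(data_hosts) if free.get(h, 0) > 0]
        anywhere = [h for h in sorted(free) if free[h] > 0]
        candidates = local or anywhere
        if not candidates:
            assignment[task_id] = None  # no free slot anywhere; the task waits
            continue
        host = candidates[0]
        free[host] -= 1
        assignment[task_id] = host
    return assignment


tasks = [("t1", {"node-a"}), ("t2", {"node-a", "node-b"}), ("t3", {"node-c"})]
trackers = {"node-a": 1, "node-b": 1, "node-c": 0}
result = assign_tasks(tasks, trackers)
print(result)
```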

Ques 20

Define TaskTracker. What is TaskTracker in Hadoop?

TaskTracker is a node in the cluster that accepts tasks (Map, Reduce, and Shuffle operations) from a JobTracker.

Ques 21

What is Map/Reduce job in Hadoop?

Map/Reduce is a programming paradigm that allows massive scalability across thousands of servers.

MapReduce refers to two distinct tasks that Hadoop performs. The first is the map job, which takes a set of data and converts it into another set of data made up of key-value tuples. The second is the reduce job, which takes the output of the map as input and combines those data tuples into a smaller set of tuples.

Ques 22

What is "map" and what is "reducer" in Hadoop?

Map: In Hadoop, map is the first phase of a MapReduce job. A mapper reads data from an input location and outputs key-value pairs according to the input type.

Reducer: In Hadoop, a reducer collects the output generated by the mappers, processes it, and creates a final output of its own.
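
The two roles can be illustrated with a word count simulated in plain Python. This is only a single-process sketch, not Hadoop code: `map_fn`, `reduce_fn`, and `run_job` are hypothetical names, and the line index stands in for the input key.

```python
from collections import defaultdict


def map_fn(_key, line):
    """Map: emit a (word, 1) pair for every word in the input line."""
    for word in line.split():
        yield word.lower(), 1


def reduce_fn(word, counts):
    """Reduce: collapse all counts for one word into a single total."""
    yield word, sum(counts)


def run_job(lines):
    grouped = defaultdict(list)
    for index, line in enumerate(lines):  # index stands in for the input key
        for key, value in map_fn(index, line):
            grouped[key].append(value)
    output = {}
    for key in sorted(grouped):  # reducers see keys in sorted order
        for out_key, out_value in reduce_fn(key, grouped[key]):
            output[out_key] = out_value
    return output


result = run_job(["Hadoop stores data", "Hadoop processes data"])
print(result)
```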

Ques 23

What is shuffling in MapReduce?

Shuffling is the process of sorting the map outputs and transferring them to the reducers as input.
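
A rough Python sketch of that idea follows. It assumes keys are routed to reducers by a hash of the key and that each reducer receives its keys in sorted order; `partition` and `shuffle` are hypothetical helpers, and CRC32 is used only to get a deterministic hash for the illustration.

```python
import zlib
from collections import defaultdict


def partition(key: str, num_reducers: int) -> int:
    """Pick a reducer for a key using a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(map_outputs, num_reducers):
    """Group map outputs by key and route each key to exactly one reducer.

    map_outputs: one list of (key, value) pairs per mapper.
    Returns one list per reducer of (key, [values]) pairs sorted by key.
    """
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for mapper_output in map_outputs:
        for key, value in mapper_output:
            buckets[partition(key, num_reducers)][key].append(value)
    return [sorted(bucket.items()) for bucket in buckets]


map_outputs = [
    [("hadoop", 1), ("data", 1)],
    [("data", 1), ("hdfs", 1), ("hadoop", 1)],
]
reducer_inputs = shuffle(map_outputs, num_reducers=2)
for reducer_id, pairs in enumerate(reducer_inputs):
    print(reducer_id, pairs)
```

Because the same key always hashes to the same reducer, all values for a key end up together no matter which mapper produced them.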

Ques 24

What is NameNode in Hadoop?

NameNode is the node where Hadoop stores all the file location information for HDFS (Hadoop Distributed File System). It is the centerpiece of an HDFS file system: it keeps a record of all the files in the file system and tracks where the file data is kept across the cluster or multiple machines.

Ques 25

What is heartbeat in HDFS?

Heartbeat is a signal sent periodically from a data node to the name node, and from a task tracker to the job tracker. If the name node or job tracker stops receiving the signal, it assumes there is some issue with the data node or task tracker.
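
The detection logic can be sketched in a few lines of Python. The `HeartbeatMonitor` class, the node names, and the 30-second timeout are all hypothetical values chosen for the illustration; they are not Hadoop's actual classes or defaults.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time per node and flags silent nodes."""

    def __init__(self, timeout_seconds: float):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}

    def record_heartbeat(self, node: str, now: float) -> None:
        self.last_seen[node] = now

    def dead_nodes(self, now: float):
        # A node is suspect once it has been silent for longer than the timeout.
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=30.0)
monitor.record_heartbeat("datanode-1", now=0.0)
monitor.record_heartbeat("datanode-2", now=0.0)
monitor.record_heartbeat("datanode-1", now=25.0)
dead_at_40 = monitor.dead_nodes(now=40.0)
print(dead_at_40)
```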

Ques 26

How is indexing done in HDFS?

Hadoop has its own way of indexing. Once the data is stored according to the block size, HDFS keeps storing the last part of the data, which specifies the location of the next part of the data.

Ques 27

What happens when a data node fails?

If a data node fails, the job tracker and name node detect the failure. All tasks that were running on the failed node are then rescheduled on other nodes, and the name node replicates the user data to another node.

Ques 28

What is Hadoop Streaming?

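Hadoop Streaming is a utility that allows you to create and run map/reduce jobs. It is a generic API that allows programs written in any language to be used as the Hadoop mapper or reducer; the programs read their input from standard input and write their output to standard output.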
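
As an illustration, a word-count mapper and reducer in the streaming style might look like the Python sketch below. It assumes tab-separated `key<TAB>value` lines and that the reducer receives its input sorted by key; the `map`/`reduce` command-line argument is a convention invented for this sketch, not something Hadoop requires.

```python
import sys
from itertools import groupby


def mapper(lines):
    """Emit 'word<TAB>1' for every word read from the input lines."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_lines):
    """Sum the counts of consecutive lines that share the same key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


if __name__ == "__main__":
    # Hypothetical convention: pass "map" or "reduce" as the first argument.
    role = sys.argv[1] if len(sys.argv) > 1 else ""
    if role in ("map", "reduce"):
        step = mapper if role == "map" else reducer
        for out in step(sys.stdin):
            print(out)
```

The same script could then be supplied to the streaming utility as both the mapper command and the reducer command, with the framework handling the sort between the two stages.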

Ques 29

What is a combiner in Hadoop?

A combiner is a mini-reduce process that operates only on data generated by a mapper. When the mapper emits data, the combiner receives it as input and sends its output to a reducer.
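
The effect of a combiner can be seen in a small Python sketch, where one mapper's output is pre-aggregated before it would be sent on to a reducer. `map_words` and `combine` are hypothetical helpers used only for this illustration.

```python
from collections import Counter


def map_words(line):
    """Mapper: one (word, 1) pair per word."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Combiner: pre-aggregate a single mapper's output locally."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())


mapper_output = map_words("to be or not to be")
combined = combine(mapper_output)
print(len(mapper_output), "pairs before combining")
print(len(combined), "pairs after combining:", combined)
```

Fewer pairs leave the mapper, which reduces the data transferred to the reducers. This only works safely when the operation, like summing, gives the same final result whether or not it is applied to partial groups first.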

Ques 30

What are the network requirements for using Hadoop?

The network requirements for using Hadoop are:

Password-less SSH connection.
Secure Shell (SSH) for launching server processes.

Ques 31

What do you understand by storage node and compute node?

Storage node: A storage node is the machine or computer where the file system resides to store the data being processed.

Compute node: A compute node is the machine or computer where the actual business logic is executed.

Ques 32

Is it necessary to know Java to learn Hadoop?

A background in any programming language such as C, C++, PHP, Python, or Java is really helpful. If you have no Java experience, however, you will need to learn Java and also get a basic knowledge of SQL.

Ques 33

How to debug Hadoop code?

There are many ways to debug Hadoop code, but the most popular methods are:

By using counters.
By using the web interface provided by the Hadoop framework.
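
The counter approach can be illustrated with a plain Python sketch: the processing code increments a named counter whenever it meets an interesting case, and the totals are inspected afterwards. The counter names and the three-field record layout are hypothetical, and a local `Counter` stands in for the job-wide counters that the framework would aggregate across tasks.

```python
from collections import Counter

counters = Counter()  # stand-in for counters the framework would aggregate


def parse_record(line):
    """Parse an 'id,name,age' record, counting good and malformed inputs."""
    fields = line.split(",")
    if len(fields) != 3:
        counters["MALFORMED_RECORDS"] += 1
        return None
    counters["GOOD_RECORDS"] += 1
    return fields


lines = ["1,alice,30", "2,bob", "3,carol,41", ""]
parsed = [rec for rec in (parse_record(line) for line in lines) if rec is not None]
print(dict(counters))
```

An unexpectedly high malformed count points at bad input or a parsing bug without having to read through task logs.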

Ques 34

Is it possible to provide multiple inputs to Hadoop? If yes, explain.

Yes, it is possible. The input format class provides methods to add multiple directories as input to a Hadoop job, for example FileInputFormat.addInputPath() and FileInputFormat.setInputPaths().

Ques 35

What is the relation between job and task in Hadoop?

In Hadoop, a job is divided into multiple small parts known as tasks.

Ques 36

What is the difference between Input Split and HDFS Block?

The logical division of data is called an Input Split, and the physical division of data is called an HDFS Block.
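
The difference shows up when a record straddles a block boundary: the block is cut at a byte position, while the split must still deliver whole records. The Python sketch below is a simplified model of the convention line-oriented readers use (skip a partial first line, finish the last line even if it runs past the range). `read_split` and the 8-byte block size are hypothetical and chosen only so that lines cross boundaries.

```python
def read_split(data: bytes, start: int, end: int):
    """Return the lines owned by the split covering bytes [start, end)."""
    pos = start
    if start != 0:
        # The previous split finishes the line that crosses into this range,
        # so skip ahead to the first line that starts after `start`.
        newline = data.find(b"\n", start)
        pos = len(data) if newline == -1 else newline + 1
    lines = []
    # Read every line that starts at or before `end`, even if it runs past it.
    while pos <= end and pos < len(data):
        newline = data.find(b"\n", pos)
        stop = len(data) if newline == -1 else newline
        lines.append(data[pos:stop].decode("utf-8"))
        pos = stop + 1
    return lines


data = b"alpha\nbravo\ncharlie\ndelta\n"
block_size = 8  # tiny hypothetical block size so that lines straddle blocks
splits = [(s, min(s + block_size, len(data))) for s in range(0, len(data), block_size)]
per_split = [read_split(data, s, e) for s, e in splits]
print(per_split)
```

Every line is read exactly once, even though none of the byte ranges line up with line boundaries.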

Ques 37

What is the difference between HDFS and NAS?

HDFS data blocks are distributed across the local drives of all machines in a cluster, whereas NAS data is stored on dedicated hardware.

Ques 38

What is the difference between Hadoop and other data processing tools?

Hadoop lets you increase or decrease the number of mappers without worrying about the volume of data to be processed.

Ques 39

What is distributed cache in Hadoop?

Distributed cache is a facility provided by the MapReduce framework for caching files (text, archives, etc.) needed by a job at execution time. The framework copies the necessary files to a slave node before any task of the job is executed on that node.

Ques 40

What is the functionality of JobTracker in Hadoop? How many instances of a JobTracker run on Hadoop cluster?

JobTracker is a daemon service used to submit and track MapReduce jobs in Hadoop. Only one JobTracker process runs on any Hadoop cluster, and it runs within its own JVM process.

Functionalities of JobTracker in Hadoop:

When a client application submits jobs to the JobTracker, the JobTracker talks to the NameNode to find the location of the data.
It locates TaskTracker nodes with available slots.
It assigns the work to the chosen TaskTracker nodes.
The TaskTracker nodes notify the JobTracker when a task fails, and the JobTracker then decides what to do. It may resubmit the task on another node, or it may mark that specific record as something to avoid.
