HDFS - Hadoop Distributed File System

Introduction

In this post we discuss the Hadoop Distributed File System (HDFS).
Hadoop is divided into two parts:
1) HDFS - storing files
2) MapReduce - processing files
HDFS: It is a special file system for storing large data on a cluster of commodity hardware with a streaming access pattern. A streaming access pattern means you can write a file once and read it any number of times, but you cannot change its content once it is stored in HDFS.
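To get a feel for the write-once model, here is a minimal sketch using the Hadoop Java FileSystem API. The NameNode address and the file path are assumptions for illustration (and in Hadoop 1.x the property name is fs.default.name rather than fs.defaultFS):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class WriteOnceExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Assumed NameNode address; replace with your cluster's setting.
            conf.set("fs.defaultFS", "hdfs://namenode:9000");

            FileSystem fs = FileSystem.get(conf);
            Path file = new Path("/data/sample.txt");   // hypothetical path

            // Write once: create the file, write, close the stream.
            try (FSDataOutputStream out = fs.create(file)) {
                out.writeUTF("HDFS follows a write-once, read-many pattern");
            }

            // The file can now be read any number of times, but its existing
            // content cannot be modified in place.
            System.out.println(fs.getFileStatus(file).getLen() + " bytes written");
        }
    }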

What is the difference between Unix file system and HDFS?

In the Unix file system the default block size is 4KB. Suppose your file is 6KB; then you need 2 blocks of 4KB each in Unix. So 8KB in total is allocated, but you actually need only 6KB, and the extra 2KB is wasted. In HDFS the default block size is 64MB (128MB in Hadoop 2.x). Suppose your file is 200MB; then HDFS needs 4 blocks (3 of 64MB + 1 of 8MB). The unused 56MB in the last block is not locked up; instead, that extra space is released for other files.
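A rough sketch of the block arithmetic from the example above (plain Java, no Hadoop API needed):

    public class BlockMath {
        public static void main(String[] args) {
            long blockSize = 64L * 1024 * 1024;   // 64MB HDFS block
            long fileSize  = 200L * 1024 * 1024;  // 200MB file from the example

            long fullBlocks  = fileSize / blockSize;                   // 3 full blocks
            long lastBlockMB = (fileSize % blockSize) / (1024 * 1024); // 8MB
            long totalBlocks = fullBlocks + (lastBlockMB > 0 ? 1 : 0); // 4 blocks

            System.out.println(totalBlocks + " blocks; the last block stores only "
                    + lastBlockMB + "MB and the remaining space is freed, not wasted");
        }
    }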


Why is the block size 64MB (128MB)?

The NameNode maintains metadata for each block in HDFS. If the block size were kept small, the metadata itself would take up relatively much more space. To reduce this overhead the block size is kept large. A large block size also reduces network traffic, because Hadoop can fetch 64MB of data at a time for processing.
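A back-of-the-envelope comparison of how many block entries the NameNode would have to track for 1TB of data (the sizes are purely illustrative):

    public class MetadataOverhead {
        public static void main(String[] args) {
            long dataSize   = 1024L * 1024 * 1024 * 1024;  // 1TB of data
            long smallBlock = 4L * 1024;                   // 4KB, Unix-style block
            long largeBlock = 64L * 1024 * 1024;           // 64MB HDFS block

            // Number of block entries the NameNode would have to track in memory.
            System.out.println("4KB blocks : " + dataSize / smallBlock + " entries");
            System.out.println("64MB blocks: " + dataSize / largeBlock + " entries");
        }
    }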

Services in Hadoop 1.x

1. NameNode
2. SecondaryNameNode
3. JobTracker
4. DataNode
5. TaskTracker

The first 3 services are master services and the last 2 are slave services. Master services can communicate with each other, and the same is true for slave services. In master-slave communication, the NameNode communicates only with DataNodes and the JobTracker communicates only with TaskTrackers. If the NameNode fails, the SecondaryNameNode holds a recent checkpoint of its metadata, but it is not an automatic failover; both the NameNode and the JobTracker are single points of failure in the Hadoop 1.x architecture.

Detailed working of HDFS:

1. The client has a file of size, say, 200MB and wants to put it in HDFS.
2. The client sends a request to the NameNode for a list of available blocks.
3. The NameNode, in response, provides the list of available blocks and maintains the metadata of the file (e.g. block numbers 1, 3, 5, 7).
4. The client sends the file to the available blocks, and HDFS internally splits the file according to the block size.
5. The 200MB file is divided into a.txt (64MB), b.txt (64MB), c.txt (64MB) and d.txt (8MB). The default replication factor in HDFS is 3. The client puts a.txt in DataNode 1 (block 1) and this block is replicated to blocks 2 and 4.
6. Block 4 sends an acknowledgement (ack) to block 2 and block 2 sends an ack to block 1.
7. Block 1 sends an ack to the client. This ack contains information about the replication of block 1's data.
8. At a regular interval every DataNode sends a block report and a heartbeat (default heartbeat interval is 3 seconds) to the NameNode.
9. The block report contains the metadata of the blocks, and the heartbeat of a node indicates that the node is alive.
10. If any DataNode in the cluster fails, the NameNode locates another node that holds the replicated data and arranges a free node to maintain the replication factor.
11. Next, the JobTracker comes into the picture.
12. The JobTracker needs data for processing, so it contacts the NameNode for block information (a small API sketch of this block lookup follows this list). The JobTracker then sends the code to the particular DataNode for processing.
13. As per master-slave communication rules the JobTracker cannot directly contact a DataNode, so it contacts the TaskTracker.
14. Map: the step in which the JobTracker sends the processing task to a TaskTracker.
15. The TaskTracker then communicates with the DataNode and processes the data available on it.
16. The Reducer runs on one of the DataNodes and combines all the output files.
17. Along with the heartbeat, information about the output file is stored in the metadata.
18. The client learns about the output by reading the metadata and then fetches the output file directly.
19. If a DataNode fails, the JobTracker assigns the task to another DataNode where the replicated data is available.
How does the JobTracker track the alive status of TaskTrackers? Each TaskTracker sends a heartbeat to the JobTracker every 3 seconds.
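As a rough sketch of how a client sees this block placement, the Hadoop Java API exposes the NameNode's metadata through FileSystem.getFileBlockLocations. The NameNode address and the file path below are assumptions for illustration:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockLocationsExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("fs.defaultFS", "hdfs://namenode:9000"); // assumed NameNode address

            FileSystem fs = FileSystem.get(conf);
            Path file = new Path("/data/sample.txt");         // hypothetical 200MB file

            FileStatus status = fs.getFileStatus(file);
            System.out.println("Replication factor: " + status.getReplication());

            // Each BlockLocation is one block plus the DataNodes holding its replicas.
            BlockLocation[] blocks =
                    fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation block : blocks) {
                System.out.println("offset " + block.getOffset()
                        + ", length " + block.getLength()
                        + ", hosts " + String.join(",", block.getHosts()));
            }
        }
    }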
