Hadoop Series Part 21 - Multi-Node Cluster Setup for Hadoop 2.6.0 on Fedora 20

Hadoop runs on commodity hardware and uses distributed computing techniques. To get maximum performance out of Hadoop, it is important to configure it as a cluster.
Requirement:
Minimum 2 systems (1 master and 1 slave).
My Configuration:
Memory : 4GB
Processor : i5 
OS Type : Fedora 20 (64 bit)


Install Java (on both master and slave):

pict@pict:~$ sudo yum update 
pict@pict:~$ sudo yum install java-1.7.0-openjdk 
pict@pict:~$ sudo yum install java-1.7.0-openjdk-devel 
pict@pict:~$ java -version

Create a dedicated "hadoop" user on both master and slave so that the Hadoop installation is separated from other services running on the same system.

pict@pict:~$ sudo useradd hadoop
pict@pict:~$ sudo passwd hadoop (set a password for the hadoop user)

Add the nodes by editing /etc/hosts on both master and slave.

pict@pict:~$ sudo gedit /etc/hosts
192.168.6.71 Master
192.168.6.72 Slave1
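
A quick sanity check that the hostnames resolve (assuming the IPs above; run this on both machines):

pict@pict:~$ ping -c 2 Master
pict@pict:~$ ping -c 2 Slave1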

Configure SSH: 
To start and stop all the nodes in the cluster, Hadoop runs scripts (start-all.sh / stop-all.sh). The master node communicates with all the slave nodes, so passwordless communication is important; it is achieved by configuring SSH on both the master and slave systems. Switch to the hadoop user (su hadoop) on the master and run:

hadoop@Master:~$ ssh-keygen -t rsa -P ""
hadoop@Master:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@Master   
hadoop@Master:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@Slave1 
hadoop@Master:~$ chmod 0600 ~/.ssh/authorized_keys
exit
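
Verify that passwordless login works; you should get a shell on the slave without being asked for a password:

hadoop@Master:~$ ssh hadoop@Slave1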

Download and install Hadoop: 
(All configuration files are already modified.)
Master node and slave node: download the folders by clicking here and copy them to the "/usr/local/" path.
** Note : The only change you need to make is in the "hadoop-env.sh" file - set your JAVA_HOME path.
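
A minimal sketch of finding and setting JAVA_HOME, assuming the OpenJDK 7 package installed above (the exact directory under /usr/lib/jvm varies by system, so adjust it to what readlink prints):

hadoop@Master:~$ readlink -f /usr/bin/java
(then, in hadoop-env.sh, point JAVA_HOME at the JDK root, for example)
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk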

For more details, see the configuration files in the
"/usr/local/hadoop/etc/hadoop" directory:
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml
masters
slaves
hadoop-env.sh // set your JAVA_HOME path 
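
As a reference, here is a minimal sketch of what these files typically contain for this two-node setup. It assumes the hostnames Master/Slave1 from /etc/hosts above and that the master also runs a DataNode (see the jps output later); your pre-configured copies may differ.

core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Master:9000</value>
  </property>
</configuration>

hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>

mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

yarn-site.xml:
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>Master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

slaves:
Master
Slave1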

Set full permissions on the hadoop folder on the slave node, then switch to the hadoop user:

pict@pict:~$ sudo chmod -R 777 /usr/local/hadoop
pict@pict:~$ su hadoop

For the first run, the "tmp" folder inside the hadoop directory needs to be cleared.

hadoop@hadoop:~$ rm -rf /usr/local/hadoop/tmp/*

Stop the firewalls (optional):

sudo systemctl stop firewalld.service
sudo systemctl stop iptables.service
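
If you want the firewall to stay off across reboots (also optional):

sudo systemctl disable firewalld.service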

Format the namenode:

hadoop@hadoop:/usr/local/hadoop$ bin/hadoop namenode -format

Start hadoop:

hadoop@hadoop:/usr/local/hadoop$ sbin/start-all.sh

Check proper installation: 

a) Master Node:
hadoop@hadoop:$ jps
NameNode
DataNode
NodeManager
ResourceManager
SecondaryNameNode

b) Slave Node:
hadoop@hadoop:$ jps
DataNode
NodeManager
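
If any daemon is missing from the jps output, check its log under /usr/local/hadoop/logs/ (the file names include the daemon name and hostname). For example, on the slave:

hadoop@hadoop:$ tail -n 50 /usr/local/hadoop/logs/hadoop-hadoop-datanode-*.log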

DataNode Details: 

hadoop@hadoop:/usr/local/hadoop$ bin/hadoop dfsadmin -report
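
You can also verify the cluster from a browser using the default Hadoop 2.6.0 web UI ports:

NameNode : http://Master:50070
ResourceManager : http://Master:8088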

Wow! You have successfully completed one of the milestones of Hadoop: a multi-node installation. If you have any doubts regarding the installation, please put your queries in the comments section and I will help you. Stay tuned with us for more articles on Hadoop.
