Install hadoop-2.7.3 on Ubuntu 16.04


Introduction

In this article we install hadoop-2.7.3 on a single node. A single system is sufficient to run all the components of hadoop, but a production-level hadoop cluster involves many more machines. Hadoop runs on any Linux distribution.

Install Java:

Hadoop is written in Java, so it requires a Java Development Kit (JDK) on your system.
Check whether Java is available on your system with either of the following commands:

$ javac
or
$ java -version

If either of these commands gives an error, you have to install Java first.
Steps to install Java:

$ sudo apt-get update
$ sudo apt-get install openjdk-8-jre
$ sudo apt-get install openjdk-8-jdk
$ java -version

Once Java is installed, set the JAVA_HOME environment variable in your .bashrc file.
Before that, how do you find where Java is installed? Use the following command:

$ ls -l /etc/alternatives/javac
lrwxrwxrwx 1 root root 36 Nov 14 23:15 /etc/alternatives/javac -> /usr/lib/jvm/java-8-oracle/bin/javac

/usr/lib/jvm/java-8-oracle is the location where Java is installed (strip the trailing /bin/javac from the symlink target). On a stock OpenJDK install from apt, the path is typically /usr/lib/jvm/java-8-openjdk-amd64.
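The JAVA_HOME value can also be derived automatically from the javac symlink. A minimal sketch, assuming a Debian-style /etc/alternatives layout (your resolved path may differ):

```shell
# Resolve the javac symlink to its real location, then strip the trailing
# /bin/javac suffix to obtain the JAVA_HOME directory.
JAVAC_PATH=$(readlink -f /etc/alternatives/javac)
JAVA_HOME_CANDIDATE=${JAVAC_PATH%/bin/javac}
echo "$JAVA_HOME_CANDIDATE"
```

The `${var%/bin/javac}` form is plain POSIX parameter expansion, so this works in any Bourne-style shell.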

export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export PATH=$PATH:$JAVA_HOME/bin

Now copy the above two lines and paste them at the end of your .bashrc file. Your Java path may be different.

Configure ssh

Hadoop requires communication between multiple components running on one or more machines, so the user we are using for hadoop must be able to connect to the required hosts without needing a password. This is done using SSH.
If ssh is not available on your system, install it using the following command:

$ sudo apt-get install ssh

After installation of ssh execute following commands:

$ ssh-keygen -t rsa -P ""
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
$ ssh localhost

Download hadoop

Download the hadoop-2.7.3 release tarball from the Apache Hadoop releases page; older releases such as 2.7.3 are kept on the Apache archive.
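As a sketch, the download URL can be built from the standard Apache archive layout (archive.apache.org/dist/hadoop/common/&lt;version&gt;/); the actual fetch is commented out since the tarball is large:

```shell
# Build the download URL for the hadoop-2.7.3 release tarball from the
# standard Apache archive directory layout.
VERSION=hadoop-2.7.3
URL="https://archive.apache.org/dist/hadoop/common/$VERSION/$VERSION.tar.gz"
echo "$URL"
# wget "$URL"    # uncomment to download (the tarball is roughly 200 MB)
```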

Extract the downloaded tarball and move the hadoop-2.7.3 folder to /usr/local/ (use sudo if your user cannot write there):

$ tar -xzf hadoop-2.7.3.tar.gz
$ sudo mv hadoop-2.7.3 /usr/local

Configure .bashrc file

Append the following lines at the end of ~/.bashrc:

export HADOOP_HOME=/usr/local/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

Reload .bashrc to apply the changes:

$ source ~/.bashrc
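A quick sanity check that the new variables took effect; a minimal sketch, guarded so it prints a diagnostic instead of failing when hadoop is not yet on the PATH:

```shell
# With ~/.bashrc sourced, the hadoop binary should resolve on the PATH;
# fall back to a hint message when it does not.
if command -v hadoop >/dev/null 2>&1; then
    hadoop version
else
    echo "hadoop not found on PATH; re-check HADOOP_HOME in ~/.bashrc"
fi
```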

Configuration

Modify hadoop-env.sh

Open the hadoop-env.sh file, find the export JAVA_HOME line, and set it to your Java installation path:

$ vim /usr/local/hadoop-2.7.3/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-oracle

Modify core-site.xml

$ vim /usr/local/hadoop-2.7.3/etc/hadoop/core-site.xml
 <configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

Modify hdfs-site.xml

$ vim /usr/local/hadoop-2.7.3/etc/hadoop/hdfs-site.xml
 <configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

Modify mapred-site.xml

  $ vim /usr/local/hadoop-2.7.3/etc/hadoop/mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Modify yarn-site.xml

 $ vim /usr/local/hadoop-2.7.3/etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

Format the namenode (only before the first start; formatting again erases all HDFS data)

$ hdfs namenode -format

(The older hadoop namenode -format form still works but is deprecated.)

Start hadoop

$ start-dfs.sh
$ start-yarn.sh

(Or use the deprecated start-all.sh script, which does both.)

Check that hadoop is running properly


 $ jps
6482 DataNode
5763 ResourceManager
6694 SecondaryNameNode
6937 NodeManager
7182 Jps
6367 NameNode

If all five daemons (NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager) appear in the jps output, your hadoop installation is working properly. :)
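Beyond jps, a small HDFS smoke test confirms the filesystem actually accepts writes and reads. A sketch, guarded so it only runs when the hdfs command is on the PATH (the /user/$USER home directory is an assumption of the default HDFS layout):

```shell
# Create a home directory in HDFS, upload a small file, and list it back.
# Guarded: prints a hint instead of failing when hadoop is not installed.
if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -mkdir -p /user/"$USER"
    hdfs dfs -put -f /etc/hostname /user/"$USER"/
    hdfs dfs -ls /user/"$USER"
else
    echo "hdfs not on PATH; start hadoop and source ~/.bashrc first"
fi
```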

Browse the web interfaces for the NameNode and the ResourceManager:


NameNode - http://localhost:50070/

SecondaryNameNode - http://localhost:50090/status.html

DataNode - http://localhost:50070/dfshealth.html#tab-datanode

ResourceManager - http://localhost:8088/

Stop Hadoop:
$ stop-all.sh

That's all in this article; we learned how to run hadoop locally.
Keep exploring your interest in hadoop.

Don't forget to like and share this post!
