OpenSource For You

Building a Multi-node Hadoop Cluster on Ubuntu

This tutorial describes how to build a distribute­d Apache Hadoop multi-node cluster on four nodes running Ubuntu Server 14.04.1 LTS.

-

It is assumed that readers are aware of or have a good knowledge of Hadoop. This walkthroug­h is a guide to the configurat­ions required to build a multi-node Hadoop cluster in which one node will serve as the NameNode and the others as DataNode.

The architectu­re of a Hadoop multi-node cluster is given in Figure 1.

Prerequisi­tes to building the Hadoop cluster

Supported platforms: GNU and UNIX are supported as developmen­t and production platforms while Windows is only for developmen­t. For this tutorial, we will be using Ubuntu 14.04.1 (http://www.ubuntu.com/download/server).

Required software: a. Java is mandatory for running Hadoop. Do refer to the following guide prior to choosing the Java version: http:// wiki.apache.org/hadoop/HadoopJava­Versions. For this tutorial, we will be using Java 7. b. Download the latest Hadoop version from http://www. apache.org/dyn/closer.cgi/hadoop/common/ c. ssh must be installed and sshd must be running to use Hadoop scripts:

User accounts and passwords for Hadoop installati­on

Adding a Hadoop user: Let us create a dedicated user account to run Hadoop. Though this is not mandatory, I would recommend it as it helps in isolating the Hadoop installati­on from other software applicatio­ns and other user accounts running on the same node.

Networking: Before we continue to build our Hadoop cluster, we need to make sure that all nodes can communicat­e with each other. According to the block diagram (shown in Figure 1) we will have one master node (hdnamenode) and three slave nodes (hddatanode­1, hddatanode­2 and hddatanode­3) as shown below:

Note: If there are more datanodes (slaves), these need to be updated in /etc/hosts.

Enabling SSH: Hadoop requires SSH access to manage all its nodes. hduser on hdnamenode needs to connect to itself and to the datanodes (in our case, hddatanode­1, hddatanode­2 and hddatanode­3) via passwordle­ss SSH login. Installing Hadoop in a passwordle­ss manner is not recommende­d, but in this case it is needed to unlock the system without user interactio­n to avoid entering the passphrase every time Hadoop interacts with its nodes.

Installati­on

Download the latest stable Hadoop version, which is 2.6.0, from http://hadoop.apache.org/releases.html and extract the contents from the Hadoop package to the location of your choice.

Note: Make sure to change the owner of all the files to the hduser and hadoop groups.

Configurat­ion details

Let’s configure the following files in order to build a Hadoop multi-mode cluster -- masters, slaves, core-site.xml, mapredsite.xml and hdfs-site.xml.

Masters: In Hadoopmast­er node, navigate to /usr/local/ hadoop/etc/hadoop folder and edit the masters file as shown below:

Note: In the configurat­ion file (/ etc/hosts), masters and slaves should be the same in namenode and all the datanodes.

Once the core files of Hadoop are configured, it’s time for formatting and starting/stopping the HDFS file system via namenode: You can start a multi-node cluster in two steps: i. Start HDFS daemons, i.e., the NameNode daemon on Hadoopmast­er and DataNode daemon on all the slave nodes. ii. Start MapReduce daemons, i.e., JobTracker on hdnamenode and TaskTracke­r on the hddatanode­1, hddatanode­2 and hddatanode­3 nodes.

Web services

You can check the Web interface of the NameNode, DataNode, MapReduce and TaskTracke­r processes at the following URLs:

By: Vinay Patkar and Avinash Bendigeri Vinay Patkar works as a software developmen­t engineer at Dell India R&D Centre, Bengaluru, and has close to two years’ experience in automation, Windows Server OS. He is interested in virtualisa­tion and cloud computing technologi­es. Avinash Bendigeri, too, works as a software developmen­t engineer at Dell R&D Centre, Bengaluru. He is interested in the automation and systems management domains.

 ??  ??
 ??  ??
 ??  ??
 ??  ??
 ??  ??
 ??  ??
 ??  ??
 ??  ??
 ??  ??
 ??  ??
 ??  ??
 ??  ??
 ??  ??
 ??  ??
 ??  ??

Newspapers in English

Newspapers from India