Tunghai University, Department of Computer Science and Information Engineering

Hadoop 2.2.0
Multi-node Installation on Ubuntu

康志強 G02357004
2014/1/3
Hadoop 2.2.0 (multi-node) Installation on Ubuntu
I. Preface
II. Installation Environment
III. Installation Steps
1. Installation environment overview
2. Settings
3. Map IP addresses to hostnames for the three machines
4. Set up passwordless SSH from cloud001 to cloud002 and cloud003
5. Install the JDK
6. Disable the firewall
7. Install Hadoop 2.2
8. Start Hadoop 2.2
V. References
I. Preface
(Omitted)

II. Installation Environment
CPU:          Intel Core i7-4470 3.40GHz
RAM:          8 GB * 2
HD:           128 SSD + 1TB HD
Network:      100M/1000M bps Ethernet
OS:           Windows7_64-bit
VM Platform:  VMware® Workstation10.0.0 build-1295980
VM Guest OS:  ubuntu-12.04.3-desktop-amd64
VM RAM:       2.0GB
VM HD:        40GB

III. Installation Steps
1. Installation environment overview
Here we build a cluster made up of three machines.

Hostname   User/Password   Cluster roles                                   OS
cloud001   hduser/adm123   NameNode, Secondary NameNode, ResourceManager   ubuntu-12.04.3 64-bit
cloud002   hduser/adm123   DataNode, NodeManager                           ubuntu-12.04.3 64-bit
cloud003   hduser/adm123   DataNode, NodeManager                           ubuntu-12.04.3 64-bit

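2. Settings
(1) Change the hostname to cloud001
sudo vim /etc/hostname

(2) Edit the privileges of hduser (editing this file requires root):
vim /etc/sudoers

(3) Upgrade the system to the latest packages
sudo apt-get update
sudo apt-get upgrade

In practice, set up cloud001 completely first, then clone it to create cloud002 and cloud003 and change the hostname on each clone.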


3. Map IP addresses to hostnames for the three machines
hduser@cloud001:~$ sudo vim /etc/hosts
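The slide gives only the edit command; the entries themselves are not shown. A minimal sketch of what they might look like follows. The 192.168.1.x addresses and the function name `cluster_hosts` are assumptions made for illustration, so substitute the real IP address of each VM.

```shell
# Print /etc/hosts entries for the three nodes.
# The 192.168.1.x addresses are placeholders (assumptions), not values
# from the original slides; replace them with each VM's real IP address.
cluster_hosts() {
  # printf reuses the format string for every pair of arguments.
  printf '%s\t%s\n' \
    192.168.1.101 cloud001 \
    192.168.1.102 cloud002 \
    192.168.1.103 cloud003
}

# Usage (appends to the system file, so it needs root):
#   cluster_hosts | sudo tee -a /etc/hosts
```

The same three lines are needed on every node, which the later cloning step takes care of if they are added on cloud001 first.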

4. Set up passwordless SSH from cloud001 to cloud002 and cloud003
(1) Install SSH
sudo apt-get install ssh

(2) Set up passwordless login to the local machine by running the following commands from the home directory
Create the .ssh directory and enter it
hduser@ubuntu:~$ mkdir .ssh
hduser@ubuntu:~$ cd .ssh
Generate the key pair (just press Enter at every prompt)
hduser@ubuntu:~/.ssh$ ssh-keygen -t rsa
Append id_rsa.pub to the authorized keys
hduser@ubuntu:~/.ssh$ cat id_rsa.pub >> authorized_keys
Restart the SSH service
hduser@ubuntu:~/.ssh$ sudo service ssh restart
Test
ssh localhost
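The commands above only enable passwordless login to the local machine. Because cloud002 and cloud003 are later cloned from cloud001, they inherit the same key pair and authorized_keys file, so logins between the nodes should then work without a password. The sketch below is one way to verify that from cloud001 once the clones are running. The function name and the `SSH_BIN` override are hypothetical helpers introduced here, not part of the original guide.

```shell
# Check that this machine can reach each node over SSH without a password prompt.
# SSH_BIN is a hypothetical override so the loop can be dry-run (e.g. SSH_BIN=echo).
check_passwordless_ssh() (
  ssh_bin=${SSH_BIN:-ssh}
  for host in "$@"; do
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if "$ssh_bin" -o BatchMode=yes "$host" hostname; then
      echo "OK: $host"
    else
      echo "FAILED: $host" >&2
      exit 1
    fi
  done
)

# Usage on cloud001:
#   check_passwordless_ssh cloud002 cloud003
```

Batch mode also refuses to ask about unknown host keys, so connecting to each node once interactively (and accepting its key) before running the check avoids a false failure.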

5. Install the JDK
Download jdk-7u45-linux-x64.tar.gz, copy it to /usr/lib/jvm, and run chmod
hduser@ubuntu:/usr/lib/jvm$ chmod 755 jdk-7u45-linux-x64.tar.gz
Install
hduser@ubuntu:/usr/lib/jvm$ sudo tar zxvf ./jdk-7u45-linux-x64.tar.gz -C /usr/lib/jvm
Environment variables
hduser@ubuntu:/usr/lib/jvm$ vim ~/.bashrc
Append the following at the end
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
Run the following command to apply the changes
hduser@ubuntu:/usr/lib/jvm$ source ~/.bashrc

Test
hduser@ubuntu:/usr/lib/jvm$ java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
hduser@ubuntu:/usr/lib/jvm$

6. Disable the firewall
hduser@ubuntu:/usr/lib/jvm$ sudo ufw disable
Firewall stopped and disabled on system startup
hduser@ubuntu:/usr/lib/jvm$
Reboot for the change to take effect

7. Install Hadoop 2.2
(1) Download hadoop-2.2.0.tar.gz and extract it under /home/hduser
hduser@ubuntu:~$ chmod 755 hadoop-2.2.0.tar.gz
hduser@ubuntu:~$ tar zxvf hadoop-2.2.0.tar.gz
(2) Configure Hadoop
Before configuring, create the following directories on cloud001
/home/hduser/dfs/name
/home/hduser/dfs/data
/home/hduser/temp
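The three directories can be created in one step. This is a minimal sketch; the function name `make_hadoop_dirs` and its optional base-directory argument are introduced here for illustration only.

```shell
# Create the NameNode, DataNode and temp directories referenced by the
# configuration files. The optional argument is a base directory
# (defaults to /home/hduser, as in the slides).
make_hadoop_dirs() (
  base=${1:-/home/hduser}
  # -p creates missing parents and does not fail if a directory already exists.
  mkdir -p "$base/dfs/name" "$base/dfs/data" "$base/temp"
)

# Usage:
#   make_hadoop_dirs
```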

Edit the following configuration files
~/hadoop-2.2.0/etc/hadoop/hadoop-env.sh
~/hadoop-2.2.0/etc/hadoop/yarn-env.sh
~/hadoop-2.2.0/etc/hadoop/slaves
~/hadoop-2.2.0/etc/hadoop/core-site.xml
~/hadoop-2.2.0/etc/hadoop/hdfs-site.xml
~/hadoop-2.2.0/etc/hadoop/mapred-site.xml (does not exist by default; rename mapred-site.xml.template)
~/hadoop-2.2.0/etc/hadoop/yarn-site.xml
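For the one file in the list that does not exist yet, a small helper can create it from the shipped template. The sketch below copies the template instead of renaming it, so that the original stays available; that is a choice made here, not something the slides specify, and the function name is hypothetical.

```shell
# Create mapred-site.xml from the shipped template if it does not exist yet.
# The optional argument is the Hadoop configuration directory.
ensure_mapred_site() (
  conf_dir=${1:-$HOME/hadoop-2.2.0/etc/hadoop}
  if [ ! -f "$conf_dir/mapred-site.xml" ]; then
    cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
  fi
)

# Usage:
#   ensure_mapred_site
```

Running it a second time leaves an already edited mapred-site.xml untouched.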
Edit hadoop-env.sh
Set JAVA_HOME (export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45)

Edit yarn-env.sh
Set JAVA_HOME (export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45)

Edit slaves (this file lists all slave nodes)
Write the following into it:
cloud002
cloud003
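The slaves file is plain text with one hostname per line. As a rough sketch, it could be written from the command line as follows; `write_slaves` is a hypothetical helper name, and it overwrites any existing slaves file in the given directory.

```shell
# Write the slaves file: one worker hostname per line.
# First argument: Hadoop configuration directory; remaining arguments: hostnames.
write_slaves() (
  conf_dir=$1
  shift
  printf '%s\n' "$@" > "$conf_dir/slaves"
)

# Usage:
#   write_slaves ~/hadoop-2.2.0/etc/hadoop cloud002 cloud003
```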
Edit core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://cloud001:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hduser/temp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>hadoop.proxyuser.hduser.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hduser.groups</name>
<value>*</value>
</property>
</configuration>
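After editing a site file by hand, it can help to read a value back and confirm that it is what was intended. The helper below is a minimal sketch that relies on the layout used in this guide (each `<name>` on its own line, followed by its `<value>`); it is not a general XML parser, and the function name is an assumption.

```shell
# Print the <value> of a named property in a Hadoop *-site.xml file.
# Assumes the one-element-per-line layout used in this guide
# (<name> on one line, its <value> on a following line).
get_hadoop_property() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}

# Usage:
#   get_hadoop_property ~/hadoop-2.2.0/etc/hadoop/core-site.xml fs.defaultFS
```

Once Hadoop is on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value that Hadoop itself resolves, which is the more authoritative check.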
Edit hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>cloud001:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hduser/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hduser/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
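This cluster has only two DataNodes (cloud002 and cloud003), so a dfs.replication of 3 cannot be fully satisfied and fsck will report blocks as under-replicated; 2 is the highest value this layout can honor.
Edit mapred-site.xml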
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>cloud001:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>cloud001:19888</value>
</property>
</configuration>
Edit yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>cloud001:8040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>cloud001:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>cloud001:8025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>cloud001:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>cloud001:8088</value>
</property>
</configuration>
Set environment variables
hduser@cloud001:~$ vim ~/.bashrc
Paste the following at the end
export HADOOP_HOME=/home/hduser/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

(3) Clone the cloud001 image to create cloud002 and cloud003, then change the hostname on each clone

8. Start Hadoop 2.2
(1) Enter the installation directory (cd ~/hadoop-2.2.0/) and format the NameNode
./bin/hdfs namenode -format

(2) Start HDFS
./sbin/start-dfs.sh
Processes now running on cloud001: NameNode, SecondaryNameNode
Processes now running on cloud002 and cloud003: DataNode

(3) Start YARN
./sbin/start-yarn.sh
Processes now running on cloud001: NameNode, SecondaryNameNode, ResourceManager
Processes now running on cloud002 and cloud003: DataNode, NodeManager
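The running processes can be listed on each node with `jps`, which ships with the JDK. The sketch below compares that output against the daemons expected on a node; the function name `missing_daemons` is a hypothetical helper, not a Hadoop command.

```shell
# Report which expected Hadoop daemons are absent from `jps` output.
# Reads jps output on stdin; arguments are the daemon class names to look for.
missing_daemons() (
  jps_output=$(cat)
  status=0
  for daemon in "$@"; do
    # jps prints "<pid> <ClassName>"; -w matches the class name as a whole
    # word, so "NameNode" does not match "SecondaryNameNode".
    if ! printf '%s\n' "$jps_output" | grep -qw "$daemon"; then
      echo "missing: $daemon"
      status=1
    fi
  done
  exit "$status"
)

# Usage on cloud001:
#   jps | missing_daemons NameNode SecondaryNameNode ResourceManager
# Usage on cloud002 and cloud003:
#   jps | missing_daemons DataNode NodeManager
```

The function prints nothing and returns 0 when every expected daemon is present.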
(4) Check the cluster status
./bin/hdfs dfsadmin -report

(5) Check the file and block layout
./bin/hdfs fsck / -files -blocks

(6) View HDFS

(7) View the ResourceManager (RM)
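Both views are web pages served from cloud001. As a small sketch, the addresses can be printed and probed as follows. Port 8088 comes from yarn.resourcemanager.webapp.address in yarn-site.xml above; port 50070 is the NameNode's default HTTP port in Hadoop 2.x and is assumed to be unchanged, since this guide does not set it. The function name is hypothetical.

```shell
# Print the web UI addresses for this cluster.
# 8088 comes from yarn.resourcemanager.webapp.address in yarn-site.xml;
# 50070 is the NameNode's default HTTP port in Hadoop 2.x (assumed unchanged here).
cluster_web_uis() (
  master=${1:-cloud001}
  echo "HDFS NameNode:        http://$master:50070/"
  echo "YARN ResourceManager: http://$master:8088/"
)

# Usage: print the URLs, or probe them with curl if it is installed:
#   cluster_web_uis
#   cluster_web_uis | awk '{print $NF}' | xargs -n1 curl -sSf -o /dev/null
```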

V. References

1. http://blog.csdn.net/licongcong_0224/article/details/12972889
2. http://blog.csdn.net/focusheart/article/details/14005893 (single-node version)
3. http://dawndiy.com/archives/155/ (installing and configuring JDK 7 on Linux)
4. http://www.ithome.com.tw/itadm/article.php?c=73978&s=1 (introduction to Hadoop)
5. http://www.runpc.com.tw/content/cloud_content.aspx?id=105318 (introduction to Hadoop)
