Author: Wu Yeliang, Cloud Computing Development Engineer
Blog: http://blog.csdn.net/wylfengyujiancheng
I. Install the JDK (on every node)
1. Environment preparation
1) master.wyl.world (Master Node)
2) node01.wyl.world (Slave Node)
3) node02.wyl.world (Slave Node)
2. Download the JDK package
[root@master ~]# curl -LO -H "Cookie: oraclelicense=accept-securebackup-cookie" \
"http://download.oracle.com/otn-pub/java/jdk/8u71-b15/jdk-8u71-linux-x64.rpm"
Install the JDK:
[root@master ~]# rpm -Uvh jdk-8u71-linux-x64.rpm
Preparing... ############################## [100%]
1:jdk1.8.0_71 ############################## [100%]
Unpacking JAR files...
rt.jar...
jsse.jar...
charsets.jar...
tools.jar...
localedata.jar...
jfxrt.jar...
3. Set the environment variables
[root@master ~]# vi /etc/profile
# append at the end of the file
export JAVA_HOME=/usr/java/default
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/jre/lib:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
4. Apply the environment variables
[root@master ~]# source /etc/profile
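To confirm the variables took effect, print the JDK version and JAVA_HOME (the exact version string depends on your build):
[root@master ~]# java -version
[root@master ~]# echo $JAVA_HOME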
5. If another JDK was installed previously, switch the system default
[root@master ~]# alternatives --config java
There are 2 programs which provide 'java'.
Selection Command
-----------------------------------------------
*+ 1 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64/jre/bin/java
2 /usr/java/jdk1.8.0_71/jre/bin/java
Select the newly installed JDK:
Enter to keep the current selection[+], or type selection number: 2
6. Write a test program
[root@master ~]# vi day.java
import java.util.Calendar;
class day {
    public static void main(String[] args) {
        Calendar cal = Calendar.getInstance();
        int year = cal.get(Calendar.YEAR);
        int month = cal.get(Calendar.MONTH) + 1;
        int day = cal.get(Calendar.DATE);
        int hour = cal.get(Calendar.HOUR_OF_DAY);
        int minute = cal.get(Calendar.MINUTE);
        System.out.println(year + "/" + month + "/" + day + " " + hour + ":" + minute);
    }
}
7. Compile
[root@master ~]# javac day.java
8. Run
[root@master ~]# java day
2015/3/16 20:30
II. Install Hadoop
1. Create the hadoop user on every node and set its password
[root@master ~]# useradd -d /usr/hadoop hadoop
[root@master ~]# chmod 755 /usr/hadoop
[root@master ~]# passwd hadoop
Changing password for user hadoop.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
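The same account must exist on node01 and node02. A minimal sketch of doing this over SSH, assuming root access to the nodes and the RHEL-style passwd --stdin option (replace ******** with a real password):
[root@master ~]# ssh root@node01.wyl.world 'useradd -d /usr/hadoop hadoop && chmod 755 /usr/hadoop && echo "********" | passwd --stdin hadoop'
[root@master ~]# ssh root@node02.wyl.world 'useradd -d /usr/hadoop hadoop && chmod 755 /usr/hadoop && echo "********" | passwd --stdin hadoop'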
2. Log in to the master node as the hadoop user, generate an SSH key pair, and copy it to the other nodes
Generate the key pair:
[hadoop@master ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/usr/hadoop/.ssh/id_rsa):
Created directory '/usr/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /usr/hadoop/.ssh/id_rsa.
Your public key has been saved in /usr/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx hadoop@master.wyl.world
The key's randomart image is:
3. Copy the key to the local machine
[hadoop@master ~]$ ssh-copy-id localhost
4. Copy the key to each slave node
[hadoop@master ~]$ ssh-copy-id node01.wyl.world
[hadoop@master ~]$ ssh-copy-id node02.wyl.world
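To confirm passwordless login works, each of these should print the remote hostname without prompting for a password:
[hadoop@master ~]$ ssh node01.wyl.world hostname
[hadoop@master ~]$ ssh node02.wyl.world hostname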
5. Install Hadoop on every node as the hadoop user
The latest releases are listed at:
https://hadoop.apache.org/releases.html
Download the package:
[hadoop@master ~]$ curl -O http://ftp.jaist.ac.jp/pub/apache/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
Extract the package into the hadoop home directory:
[hadoop@master ~]$ tar zxvf hadoop-2.7.3.tar.gz -C /usr/hadoop --strip-components 1
Add the Hadoop variables to the shell profile:
[hadoop@master ~]$ vi ~/.bash_profile
# append at the end of the file
export HADOOP_HOME=/usr/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
Apply the variables:
[hadoop@master ~]$ source ~/.bash_profile
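A quick check that the hadoop command is now on the PATH (it should report release 2.7.3):
[hadoop@master ~]$ hadoop version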
6. Configure Hadoop from the master node as the hadoop user
Create the DataNode directories on every node:
[hadoop@master ~]$ mkdir ~/datanode
[hadoop@master ~]$ ssh node01.wyl.world "mkdir ~/datanode"
[hadoop@master ~]$ ssh node02.wyl.world "mkdir ~/datanode"
7. Edit ~/etc/hadoop/hdfs-site.xml
Add the following between <configuration> and </configuration>:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///usr/hadoop/datanode</value>
</property>
</configuration>
8. Copy it to the other nodes
[hadoop@master ~]$ scp ~/etc/hadoop/hdfs-site.xml node01.wyl.world:~/etc/hadoop/
[hadoop@master ~]$ scp ~/etc/hadoop/hdfs-site.xml node02.wyl.world:~/etc/hadoop/
9. Edit ~/etc/hadoop/core-site.xml
Add the following between <configuration> and </configuration>:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master.wyl.world:9000/</value>
</property>
</configuration>
10. Copy it to the other nodes, set JAVA_HOME explicitly in hadoop-env.sh and distribute that file as well, then create the NameNode directory on the master
[hadoop@master ~]$ scp ~/etc/hadoop/core-site.xml node01.wyl.world:~/etc/hadoop/
[hadoop@master ~]$ scp ~/etc/hadoop/core-site.xml node02.wyl.world:~/etc/hadoop/
[hadoop@master ~]$ sed -i -e 's/\${JAVA_HOME}/\/usr\/java\/default/' ~/etc/hadoop/hadoop-env.sh
[hadoop@master ~]$ scp ~/etc/hadoop/hadoop-env.sh node01.wyl.world:~/etc/hadoop/
[hadoop@master ~]$ scp ~/etc/hadoop/hadoop-env.sh node02.wyl.world:~/etc/hadoop/
[hadoop@master ~]$ mkdir ~/namenode
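To confirm the sed edit took effect, the JAVA_HOME line should now read /usr/java/default:
[hadoop@master ~]$ grep 'export JAVA_HOME' ~/etc/hadoop/hadoop-env.sh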
11. Edit ~/etc/hadoop/hdfs-site.xml again (on the master only, since only the master runs the NameNode)
Add the following between <configuration> and </configuration>:
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///usr/hadoop/namenode</value>
</property>
</configuration>
12. Create ~/etc/hadoop/mapred-site.xml and write the following
# create new
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
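The 2.7.3 tarball ships only a template for this file; one way to seed it before editing, and to distribute the finished file to the slave nodes afterwards (a sketch, assuming the same paths as above):
[hadoop@master ~]$ cp ~/etc/hadoop/mapred-site.xml.template ~/etc/hadoop/mapred-site.xml
[hadoop@master ~]$ scp ~/etc/hadoop/mapred-site.xml node01.wyl.world:~/etc/hadoop/
[hadoop@master ~]$ scp ~/etc/hadoop/mapred-site.xml node02.wyl.world:~/etc/hadoop/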
13. Configure ~/etc/hadoop/yarn-site.xml
Add the following between <configuration> and </configuration>:
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master.wyl.world</value>
</property>
<property>
<name>yarn.nodemanager.hostname</name>
<value>master.wyl.world</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
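The NodeManagers on the slaves also read this file (they need the ResourceManager address and the mapreduce_shuffle aux-service), so it is usually copied to them as well; note that yarn.nodemanager.hostname, if kept, should name each node's own hostname, so consider removing that property on the slaves. A sketch, assuming the same layout:
[hadoop@master ~]$ scp ~/etc/hadoop/yarn-site.xml node01.wyl.world:~/etc/hadoop/
[hadoop@master ~]$ scp ~/etc/hadoop/yarn-site.xml node02.wyl.world:~/etc/hadoop/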
14. List every worker node in ~/etc/hadoop/slaves
# add all nodes and remove the default localhost entry
master.wyl.world
node01.wyl.world
node02.wyl.world
15. Format the NameNode and start the Hadoop services
Format the NameNode:
[hadoop@master ~]$ hdfs namenode -format
15/07/28 19:58:14 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master.wyl.world/10.0.0.30
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.7.3
.....
.....
15/07/28 19:58:17 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master.wyl.world/10.0.0.30
************************************************************/
Start HDFS:
[hadoop@master ~]$ start-dfs.sh
Starting namenodes on [master.wyl.world]
master.wyl.world: starting namenode, logging to /usr/hadoop/logs/hadoop-hadoop-namenode-master.wyl.world.out
master.wyl.world: starting datanode, logging to /usr/hadoop/logs/hadoop-hadoop-datanode-master.wyl.world.out
node02.wyl.world: starting datanode, logging to /usr/hadoop/logs/hadoop-hadoop-datanode-node02.wyl.world.out
node01.wyl.world: starting datanode, logging to /usr/hadoop/logs/hadoop-hadoop-datanode-node01.wyl.world.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/hadoop/logs/hadoop-hadoop-secondarynamenode-master.wyl.world.out
Start YARN:
[hadoop@master ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/hadoop/logs/yarn-hadoop-resourcemanager-master.wyl.world.out
master.wyl.world: starting nodemanager, logging to /usr/hadoop/logs/yarn-hadoop-nodemanager-master.wyl.world.out
node02.wyl.world: starting nodemanager, logging to /usr/hadoop/logs/yarn-hadoop-nodemanager-node02.wyl.world.out
node01.wyl.world: starting nodemanager, logging to /usr/hadoop/logs/yarn-hadoop-nodemanager-node01.wyl.world.out
16. Check the daemon status. A healthy master shows the processes below; if anything is missing, go back and recheck the configuration.
[hadoop@master ~]$ jps
2130 NameNode
2437 SecondaryNameNode
2598 ResourceManager
2710 NodeManager
3001 Jps
2267 DataNode
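The slaves can be checked the same way over SSH (each should show a DataNode and a NodeManager; if jps is not found in a non-interactive shell, call it via its full path under $JAVA_HOME/bin), and hdfs dfsadmin gives a cluster-wide view:
[hadoop@master ~]$ ssh node01.wyl.world jps
[hadoop@master ~]$ ssh node02.wyl.world jps
[hadoop@master ~]$ hdfs dfsadmin -report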
17. Create a directory in HDFS
[hadoop@master ~]$ hdfs dfs -mkdir /test
18. Copy a local file into /test
[hadoop@master ~]$ hdfs dfs -copyFromLocal ~/NOTICE.txt /test
19. Display the file contents
[hadoop@master ~]$ hdfs dfs -cat /test/NOTICE.txt
This product includes software developed by The Apache Software
Foundation (http://www.apache.org/).
20. Run a sample MapReduce job
[hadoop@master ~]$ hadoop jar ~/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /test/NOTICE.txt /output01
15/07/28 19:28:47 INFO client.RMProxy: Connecting to ResourceManager at master.wyl.world/10.0.0.30:8032
15/07/28 19:28:48 INFO input.FileInputFormat: Total input paths to process : 1
15/07/28 19:28:48 INFO mapreduce.JobSubmitter: number of splits:1
.....
.....
21. List the job output
[hadoop@master ~]$ hdfs dfs -ls /output01
Found 2 items
-rw-r--r-- 2 hadoop supergroup 0 2015-07-29 14:29 /output01/_SUCCESS
-rw-r--r-- 2 hadoop supergroup 123 2015-07-29 14:29 /output01/part-r-00000
22. Display the result file
[hadoop@master ~]$ hdfs dfs -cat /output01/part-r-00000
(http://www.apache.org/). 1
Apache 1
Foundation 1
Software 1
The 1
This 1
by 1
developed 1
includes 1
product 1
software 1
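MapReduce refuses to overwrite an existing output directory, so remove it before re-running the job:
[hadoop@master ~]$ hdfs dfs -rm -r /output01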
View the cluster summary in the NameNode web UI:
http://(server's hostname or IP address):50070/
View cluster and application details in the ResourceManager web UI:
http://(server's hostname or IP address):8088/
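If the web UIs are unreachable from another machine, the ports may be blocked; on a firewalld-based system one way to open them for the current runtime only (an assumption about your firewall setup, run as root):
[root@master ~]# firewall-cmd --add-port=50070/tcp --add-port=8088/tcp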