In my case it is placed under the opt/ directory.
Introduction to the Hadoop installation package
Step 1: Upload the Hadoop installation package from the host machine to the virtual machine
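For example, with scp from the host machine (a minimal sketch: the local path and the VM address are placeholders, and the file name hadoop-2.7.6.tar.gz is assumed from the examples jar version used later):
# run on the host machine; replace the path and <VM-IP> with your own values
scp /path/to/hadoop-2.7.6.tar.gz root@<VM-IP>:/opt/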
Step 2: Extract the installation package and rename the extracted directory
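A sketch of this step, assuming the tarball sits in /opt/ and is named hadoop-2.7.6.tar.gz (the version is inferred from the examples jar shown later):
[root@master ~]# cd /opt/
[root@master opt]# tar -zxvf hadoop-2.7.6.tar.gz
[root@master opt]# mv hadoop-2.7.6 hadoop    # rename so the install path is /opt/hadoop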
Step 3: Configure the Hadoop environment variables and source the profile so they take effect
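A minimal sketch of the variables, appended for example to /etc/profile (your file and exact variable set may differ):
[root@master ~]# vi /etc/profile
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
[root@master ~]# source /etc/profile    # make the variables take effect in the current shell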
Step 4: Test
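A quick sanity check (not one of the numbered tests that follow, just a way to confirm the PATH is picked up):
[root@master ~]# hadoop version    # should print Hadoop 2.7.6 if the environment variables are in effect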
Test 2: run the MapReduce example program grep
Purpose → count, across all the resources (each xml file) in the source directory, the occurrences of strings that start with dfs (i.e. match the regular expression 'dfs[a-z.]').
Steps:
① Prepare the test data:
[root@master ~]# cd /opt/hadoop/etc/hadoop/
[root@master hadoop]# mkdir ~/input
[root@master hadoop]# cp *.xml ~/input
[root@master hadoop]# ll ~/input
total 48
-rw-r--r-- 1 root root 4436 Sep 21 14:23 capacity-scheduler.xml
-rw-r--r-- 1 root root  774 Sep 21 14:23 core-site.xml
-rw-r--r-- 1 root root 9683 Sep 21 14:23 hadoop-policy.xml
-rw-r--r-- 1 root root  775 Sep 21 14:23 hdfs-site.xml
-rw-r--r-- 1 root root  620 Sep 21 14:23 httpfs-site.xml
-rw-r--r-- 1 root root 3518 Sep 21 14:23 kms-acls.xml
-rw-r--r-- 1 root root 5540 Sep 21 14:23 kms-site.xml
-rw-r--r-- 1 root root  690 Sep 21 14:23 yarn-site.xml
② Run the built-in example program (MapReduce):
[root@master input]# hadoop jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.6.jar grep ~/input ~/output 'dfs[a-z.]'
20/09/21 14:30:41 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
20/09/21 14:30:41 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
20/09/21 14:30:41 INFO input.FileInputFormat: Total input paths to process : 8
20/09/21 14:30:41 INFO mapreduce.JobSubmitter: number of splits:8
20/09/21 14:30:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1843484991_0001
20/09/21 14:30:41 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
20/09/21 14:30:41 INFO mapreduce.Job: Running job: job_local1843484991_0001
20/09/21 14:30:41 INFO mapred.LocalJobRunner: OutputCommitter set in config null
20/09/21 14:30:41 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
20/09/21 14:30:41 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
20/09/21 14:30:41 INFO mapred.LocalJobRunner: Waiting for map tasks
20/09/21 14:30:41 INFO mapred.LocalJobRunner: Starting task: attempt_local1843484991_0001_m_000000_0
20/09/21 14:30:41 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
20/09/21 14:30:41 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
20/09/21 14:30:41 INFO mapred.MapTask: Processing split: file:/root/input/hadoop-policy.xml:0+9683
20/09/21 14:30:41 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
20/09/21 14:30:41 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
20/09/21 14:30:41 INFO mapred.MapTask: soft limit at 83886080
20/09/21 14:30:41 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
20/09/21 14:30:41 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
20/09/21 14:30:41 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
20/09/21 14:30:41 INFO mapred.LocalJobRunner:
20/09/21 14:30:41 INFO mapred.MapTask: Starting flush of map output
20/09/21 14:30:41 INFO mapred.MapTask: Spilling map output
20/09/21 14:30:41 INFO mapred.MapTask: bufstart = 0; bufend = 13; bufvoid = 104857600
20/09/21 14:30:41 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214396(104857584); length = 1/6553600
20/09/21 14:30:42 INFO mapred.MapTask: Finished spill 0
20/09/21 14:30:42 INFO mapred.Task: Task:attempt_local1843484991_0001_m_000000_0 is done. And is in the process of committing
20/09/21 14:30:42 INFO mapred.LocalJobRunner: map
20/09/21 14:30:42 INFO mapred.Task: Task 'attempt_local1843484991_0001_m_000000_0' done.
20/09/21 14:30:42 INFO mapred.Task: Final Counters for attempt_local1843484991_0001_m_000000_0: Counters: 18
  File System Counters
    FILE: Number of bytes read=306465
    FILE: Number of bytes written=590739
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
  Map-Reduce Framework
    Map input records=226
    Map output records=1
    Map output bytes=13
    Map output materialized bytes=21
    Input split bytes=99
    Combine input records=1
    Combine output records=1
    Spilled Records=1
    Failed Shuffles=0
    Merged Map outputs=0
    GC time elapsed (ms)=0