Environment
This document installs a Hadoop cluster with one master acting as the NameNode and one slave acting as a DataNode:
(1) master:
os: CentOS release 6.5 (Final)
ip: 172.16.101.58
user:root
hadoop-2.9.0.tar.gz
(2) slave:
os: CentOS release 6.5 (Final)
ip: 172.16.101.59
user:root
hadoop-2.9.0.tar.gz
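Both nodes should also be able to resolve each other's hostnames. A minimal /etc/hosts sketch, assuming the hostname-to-IP mapping implied by the shell prompts and the dfsadmin report below:
# /etc/hosts on both master and slave (mapping assumed from the prompts sht-sgmhadoopdn-01/02)
172.16.101.58   sht-sgmhadoopdn-01
172.16.101.59   sht-sgmhadoopdn-02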
Prerequisites
(1) Java is installed on both the master and the slave, with the environment variables configured;
(2) hadoop-2.9.0.tar.gz has been extracted on the master node, with the environment variables configured;
(3) This document installs everything as the root user, so root on the master must be able to ssh to the slave as root without a password (see the sketch below).
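The exact environment variables and ssh setup are not shown here; the following is a minimal sketch, assuming a JDK under /usr/java (the JAVA_HOME path in particular is an assumption) and Hadoop under /usr/local/hadoop-2.9.0:
# Appended to /etc/profile on both nodes; the JDK path is an assumption
export JAVA_HOME=/usr/java/jdk1.8.0_152
export HADOOP_HOME=/usr/local/hadoop-2.9.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# On the master: generate a key pair and copy the public key to the slave
[root@sht-sgmhadoopdn-01 ~]# ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
[root@sht-sgmhadoopdn-01 ~]# ssh-copy-id root@172.16.101.59
[root@sht-sgmhadoopdn-01 ~]# ssh root@172.16.101.59 hostname    # should print sht-sgmhadoopdn-02 without prompting for a password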
Cluster configuration files
Run on the master node (this document configures the files on the master first, then copies them to the slave nodes with scp):
(1) slaves file: write the hostname or IP address of every DataNode into this file, one per line. The default is localhost, which is why in the pseudo-distributed setup the node acts as both NameNode and DataNode.
[root@sht-sgmhadoopdn-01 hadoop]# cat slaves
172.16.101.59
(2) core-site.xml
[root@sht-sgmhadoopdn-01 hadoop]# cat /usr/local/hadoop-2.9.0/etc/hadoop/core-site.xml
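The file contents were not reproduced above; this is a minimal core-site.xml sketch consistent with the rest of the document (the NameNode port 9000 is an assumption, and hadoop.tmp.dir matches the tmp directory that is deleted before copying to the slave):
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://172.16.101.58:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-2.9.0/tmp</value>
  </property>
</configuration>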
(3) hdfs-site.xml
[root@sht-sgmhadoopdn-01 hadoop]# cat /usr/local/hadoop-2.9.0/etc/hadoop/hdfs-site.xml
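A minimal hdfs-site.xml sketch using the four properties that later appear in the example job's grep output; dfs.replication is 1 because there is a single DataNode, while the secondary NameNode address and the name/data directories are assumptions:
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>172.16.101.58:50090</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop-2.9.0/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop-2.9.0/tmp/dfs/data</value>
  </property>
</configuration>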
(4) mapred-site.xml
[root@sht-sgmhadoopdn-01 hadoop]# cat /usr/local/hadoop-2.9.0/etc/hadoop/mapred-site.xml
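A minimal mapred-site.xml sketch; mapreduce.framework.name=yarn is what makes the example job run on YARN, and the JobHistory addresses are assumptions matching the historyserver started below:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>172.16.101.58:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>172.16.101.58:19888</value>
  </property>
</configuration>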
(5) yarn-site.xml
[root@sht-sgmhadoopdn-01 hadoop]# cat /usr/local/hadoop-2.9.0/etc/hadoop/yarn-site.xml
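A minimal yarn-site.xml sketch; pointing yarn.resourcemanager.hostname at the master is consistent with the ResourceManager address 172.16.101.58:8032 that the example job connects to later:
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>172.16.101.58</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>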
Once the files are configured, copy /usr/local/hadoop-2.9.0 from the master to each slave node. Because this machine previously ran in pseudo-distributed mode, it is recommended to delete the old temporary files and logs before switching to cluster mode.
[root@sht-sgmhadoopdn-01 local]# rm -rf ./hadoop-2.9.0/tmp
[root@sht-sgmhadoopdn-01 local]# rm -rf ./hadoop-2.9.0/logs
[root@sht-sgmhadoopdn-01 local]# tar -zcf hadoop-2.9.0.master.tar.gz ./hadoop-2.9.0
[root@sht-sgmhadoopdn-01 local]# scp hadoop-2.9.0.master.tar.gz sht-sgmhadoopdn-02:/usr/local/
Run on the slave node:
[root@sht-sgmhadoopdn-02 local]# tar -zxf hadoop-2.9.0.master.tar.gz
Starting the Hadoop cluster
Run on the master node:
# HDFS only needs to be formatted before the first start; subsequent starts do not need it
[root@sht-sgmhadoopdn-01 hadoop-2.9.0]# hdfs namenode -format
[root@sht-sgmhadoopdn-01 hadoop-2.9.0]# start-dfs.sh
[root@sht-sgmhadoopdn-01 hadoop-2.9.0]# start-yarn.sh
[root@sht-sgmhadoopdn-01 hadoop-2.9.0]# mr-jobhistory-daemon.sh start historyserver
[root@sht-sgmhadoopdn-01 hadoop-2.9.0]# jps
20289 JobHistoryServer
19730 ResourceManager
18934 NameNode
19163 SecondaryNameNode
20366 Jps
Run on the slave node:
[root@sht-sgmhadoopdn-02 hadoop]# jps
32147 DataNode
535 Jps
32559 NodeManager
Run on the master node:
[root@sht-sgmhadoopdn-01 hadoop]# hdfs dfsadmin -report
Configured Capacity: 75831140352 (70.62 GB)
Present Capacity: 21246287872 (19.79 GB)
DFS Remaining: 21246263296 (19.79 GB)
DFS Used: 24576 (24 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (1): # number of live DataNodes (slaves)
Name: 172.16.101.59:50010 (sht-sgmhadoopdn-02)
Hostname: sht-sgmhadoopdn-02
Decommission Status : Normal
Configured Capacity: 75831140352 (70.62 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 50732867584 (47.25 GB)
DFS Remaining: 21246263296 (19.79 GB)
DFS Used%: 0.00%
DFS Remaining%: 28.02%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Dec 27 11:08:46 CST 2017
Last Block Report: Wed Dec 27 11:02:01 CST 2017
Web console
NameNode - http://172.16.101.58:50070
Running a distributed MapReduce example job
[root@sht-sgmhadoopdn-01 hadoop]# hdfs dfs -mkdir -p /user/root/input
[root@sht-sgmhadoopdn-01 hadoop]# hdfs dfs -put /usr/local/hadoop-2.9.0/etc/hadoop/*.xml input
[root@sht-sgmhadoopdn-01 hadoop]# hadoop jar /usr/local/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar grep input output 'dfs[a-z.]+'
17/12/27 11:25:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/12/27 11:25:34 INFO client.RMProxy: Connecting to ResourceManager at /172.16.101.58:8032
17/12/27 11:25:36 INFO input.FileInputFormat: Total input files to process : 9
17/12/27 11:25:36 INFO mapreduce.JobSubmitter: number of splits:9
17/12/27 11:25:37 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
17/12/27 11:25:37 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1514343869308_0001
17/12/27 11:25:38 INFO impl.YarnClientImpl: Submitted application application_1514343869308_0001
17/12/27 11:25:38 INFO mapreduce.Job: The url to track the job: http://sht-sgmhadoopdn-01:8088/proxy/application_1514343869308_0001/
17/12/27 11:25:38 INFO mapreduce.Job: Running job: job_1514343869308_0001
17/12/27 11:25:51 INFO mapreduce.Job: Job job_1514343869308_0001 running in uber mode : false
17/12/27 11:25:51 INFO mapreduce.Job: map 0% reduce 0%
17/12/27 11:26:14 INFO mapreduce.Job: map 11% reduce 0%
17/12/27 11:26:15 INFO mapreduce.Job: map 67% reduce 0%
17/12/27 11:26:29 INFO mapreduce.Job: map 100% reduce 0%
17/12/27 11:26:32 INFO mapreduce.Job: map 100% reduce 100%
17/12/27 11:26:34 INFO mapreduce.Job: Job job_1514343869308_0001 completed successfully
17/12/27 11:26:34 INFO mapreduce.Job: Counters: 50
......
[root@sht-sgmhadoopdn-01 hadoop]# hdfs dfs -cat output/*
17/12/27 11:30:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
1 dfsadmin
1 dfs.replication
1 dfs.namenode.secondary.http
1 dfs.namenode.name.dir
1 dfs.datanode.data.dir
You can also open the console in a browser to see detailed job information:
ResourceManager - http://172.16.101.58:8088
Stopping the Hadoop cluster
Run on the master node:
[root@sht-sgmhadoopdn-01 hadoop]# stop-yarn.sh
[root@sht-sgmhadoopdn-01 hadoop]# stop-dfs.sh
[root@sht-sgmhadoopdn-01 hadoop]# mr-jobhistory-daemon.sh stop historyserver
References:
http://www.powerxing.com/install-hadoop-cluster/
http://hadoop.apache.org/docs/r2.9.0/hadoop-project-dist/hadoop-common/ClusterSetup.html