This article walks through an experiment that combines Hadoop 2 NameNode HA, HDFS federation, and ResourceManager HA on a small cluster, covering configuration, formatting and startup, and a failover check.
The experiment uses Hadoop 2.5.2 on five virtual machines, all running CentOS 6.6. Their IP addresses and hostnames are:
192.168.63.171 node1.zhch
192.168.63.172 node2.zhch
192.168.63.173 node3.zhch
192.168.63.174 node4.zhch
192.168.63.175 node5.zhch
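Passwordless SSH, firewall settings, and JDK installation are not covered again here. The roles are assigned as follows:
node1: primary namenode1, primary resource manager, zookeeper, journalnode
node2: standby namenode1, zookeeper, journalnode
node3: primary namenode2, standby resource manager, zookeeper, journalnode, datanode
node4: standby namenode2, datanode
node5: datanode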
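Once the daemons from the later startup steps are running, the role assignment above can be turned into a quick sanity check. The following is only a sketch: `expected_daemons` and `missing_daemons` are hypothetical helper names, and the process names assume the usual `jps` labels (`DFSZKFailoverController` for zkfc, `NodeManager` on every host listed in the slaves file).

```shell
# Hypothetical helper: list the JVM process names expected on each node,
# derived from the role table in this article (assumption: NodeManagers run
# on the same hosts as the DataNodes, i.e. those in the slaves file).
expected_daemons() {
  case "$1" in
    node1) echo "QuorumPeerMain JournalNode NameNode DFSZKFailoverController ResourceManager" ;;
    node2) echo "QuorumPeerMain JournalNode NameNode DFSZKFailoverController" ;;
    node3) echo "QuorumPeerMain JournalNode NameNode DFSZKFailoverController ResourceManager DataNode NodeManager" ;;
    node4) echo "NameNode DFSZKFailoverController DataNode NodeManager" ;;
    node5) echo "DataNode NodeManager" ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: read `jps` output on stdin and print every expected
# daemon for the given node that does not appear in it.
missing_daemons() {
  local node="$1" running d
  running="$(awk '{ print $2 }')"   # second column of jps output is the process name
  for d in $(expected_daemons "$node"); do
    printf '%s\n' "$running" | grep -qx "$d" || echo "$d"
  done
}

# Example use on node1 (prints nothing when everything is up):
#   jps | missing_daemons node1
```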
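The procedure is largely the same as for a plain NameNode HA installation, and the ZooKeeper cluster must be installed first. The main differences lie in core-site.xml, hdfs-site.xml, and yarn-site.xml; the remaining files are configured essentially as in the NameNode HA setup.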
1. Configuring Hadoop
## Unpack
[yyl@node1 program]$ tar -zxf hadoop-2.5.2.tar.gz

## Create directories
[yyl@node1 program]$ mkdir hadoop-2.5.2/name
[yyl@node1 program]$ mkdir hadoop-2.5.2/data
[yyl@node1 program]$ mkdir hadoop-2.5.2/journal
[yyl@node1 program]$ mkdir hadoop-2.5.2/tmp

## Configure hadoop-env.sh
[yyl@node1 program]$ cd hadoop-2.5.2/etc/hadoop/
[yyl@node1 hadoop]$ vim hadoop-env.sh
export JAVA_HOME=/usr/lib/java/jdk1.7.0_80

## Configure yarn-env.sh
[yyl@node1 hadoop]$ vim yarn-env.sh
export JAVA_HOME=/usr/lib/java/jdk1.7.0_80

## Configure slaves
[yyl@node1 hadoop]$ vim slaves
node3.zhch
node4.zhch
node5.zhch

In the four configuration files below, each "name = value" pair stands for one <property> entry (with <name> and <value> children) inside <configuration>.

## Configure mapred-site.xml
[yyl@node1 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[yyl@node1 hadoop]$ vim mapred-site.xml
mapreduce.framework.name = yarn
mapreduce.jobhistory.address = node2.zhch:10020
mapreduce.jobhistory.webapp.address = node2.zhch:19888

## Configure core-site.xml
[yyl@node1 hadoop]$ vim core-site.xml
fs.defaultFS = hdfs://mycluster
io.file.buffer.size = 131072
hadoop.tmp.dir = file:/home/yyl/program/hadoop-2.5.2/tmp
hadoop.proxyuser.hduser.hosts = *
hadoop.proxyuser.hduser.groups = *
ha.zookeeper.quorum = node1.zhch:2181,node2.zhch:2181,node3.zhch:2181
ha.zookeeper.session-timeout.ms = 1000

## Configure hdfs-site.xml
[yyl@node1 hadoop]$ vim hdfs-site.xml
dfs.namenode.name.dir = file:/home/yyl/program/hadoop-2.5.2/name
dfs.datanode.data.dir = file:/home/yyl/program/hadoop-2.5.2/data
dfs.replication = 1
dfs.webhdfs.enabled = true
dfs.permissions = false
dfs.permissions.enabled = false
dfs.nameservices = mycluster,yourcluster
dfs.ha.namenodes.mycluster = nn1,nn2
dfs.namenode.rpc-address.mycluster.nn1 = node1.zhch:9000
dfs.namenode.rpc-address.mycluster.nn2 = node2.zhch:9000
dfs.namenode.servicerpc-address.mycluster.nn1 = node1.zhch:53310
dfs.namenode.servicerpc-address.mycluster.nn2 = node2.zhch:53310
dfs.namenode.http-address.mycluster.nn1 = node1.zhch:50070
dfs.namenode.http-address.mycluster.nn2 = node2.zhch:50070
dfs.ha.namenodes.yourcluster = nn1,nn2
dfs.namenode.rpc-address.yourcluster.nn1 = node3.zhch:9000
dfs.namenode.rpc-address.yourcluster.nn2 = node4.zhch:9000
dfs.namenode.servicerpc-address.yourcluster.nn1 = node3.zhch:53310
dfs.namenode.servicerpc-address.yourcluster.nn2 = node4.zhch:53310
dfs.namenode.http-address.yourcluster.nn1 = node3.zhch:50070
dfs.namenode.http-address.yourcluster.nn2 = node4.zhch:50070
dfs.namenode.shared.edits.dir = qjournal://node1.zhch:8485;node2.zhch:8485;node3.zhch:8485/mycluster
dfs.client.failover.proxy.provider.mycluster = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
dfs.client.failover.proxy.provider.yourcluster = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
dfs.ha.fencing.methods = sshfence
dfs.ha.fencing.ssh.private-key-files = /home/yyl/.ssh/id_rsa
dfs.ha.fencing.ssh.connect-timeout = 30000
dfs.journalnode.edits.dir = /home/yyl/program/hadoop-2.5.2/journal
dfs.ha.automatic-failover.enabled.mycluster = true
dfs.ha.automatic-failover.enabled.yourcluster = true
ha.failover-controller.cli-check.rpc-timeout.ms = 60000
ipc.client.connect.timeout = 60000
dfs.image.transfer.bandwidthPerSec = 4194304

## Configure yarn-site.xml
[yyl@node1 hadoop]$ vim yarn-site.xml
yarn.nodemanager.aux-services = mapreduce_shuffle
yarn.nodemanager.aux-services.mapreduce_shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
yarn.resourcemanager.connect.retry-interval.ms = 2000
yarn.resourcemanager.ha.enabled = true
yarn.resourcemanager.ha.automatic-failover.enabled = true
yarn.resourcemanager.ha.automatic-failover.embedded = true
yarn.resourcemanager.cluster-id = yarn-cluster
yarn.resourcemanager.ha.rm-ids = rm1,rm2
yarn.resourcemanager.ha.id = rm1
yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
yarn.resourcemanager.recovery.enabled = true
yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms = 5000
yarn.resourcemanager.store.class = org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore
yarn.resourcemanager.zk-address = node1.zhch:2181,node2.zhch:2181,node3.zhch:2181
yarn.resourcemanager.zk.state-store.address = node1.zhch:2181,node2.zhch:2181,node3.zhch:2181
yarn.resourcemanager.address.rm1 = node1.zhch:23140
yarn.resourcemanager.address.rm2 = node3.zhch:23140
yarn.resourcemanager.scheduler.address.rm1 = node1.zhch:23130
yarn.resourcemanager.scheduler.address.rm2 = node3.zhch:23130
yarn.resourcemanager.admin.address.rm1 = node1.zhch:23141
yarn.resourcemanager.admin.address.rm2 = node3.zhch:23141
yarn.resourcemanager.resource-tracker.address.rm1 = node1.zhch:23125
yarn.resourcemanager.resource-tracker.address.rm2 = node3.zhch:23125
yarn.resourcemanager.webapp.address.rm1 = node1.zhch:23188
yarn.resourcemanager.webapp.address.rm2 = node3.zhch:23188
yarn.resourcemanager.webapp.https.address.rm1 = node1.zhch:23189
yarn.resourcemanager.webapp.https.address.rm2 = node3.zhch:23189

## Distribute to all nodes
[yyl@node1 hadoop]$ cd /home/yyl/program/
[yyl@node1 program]$ scp -rp hadoop-2.5.2 yyl@node2.zhch:/home/yyl/program/
[yyl@node1 program]$ scp -rp hadoop-2.5.2 yyl@node3.zhch:/home/yyl/program/
[yyl@node1 program]$ scp -rp hadoop-2.5.2 yyl@node4.zhch:/home/yyl/program/
[yyl@node1 program]$ scp -rp hadoop-2.5.2 yyl@node5.zhch:/home/yyl/program/

## On primary namenode2 (node3.zhch) and standby namenode2 (node4.zhch), change the value of dfs.namenode.shared.edits.dir in hdfs-site.xml to qjournal://node1.zhch:8485;node2.zhch:8485;node3.zhch:8485/yourcluster and leave all other properties unchanged.

## On the standby resource manager (node3.zhch), change the value of yarn.resourcemanager.ha.id in yarn-site.xml to rm2 and leave all other properties unchanged.

## Set the Hadoop environment variables on every node
[yyl@node1 ~]$ vim .bash_profile
export HADOOP_PREFIX=/home/yyl/program/hadoop-2.5.2
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
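The two per-node changes described above (the journal URI on node3/node4 and the resource manager id on node3) are easy to get wrong when edited by hand after the scp. A minimal sketch of how they could be scripted follows; `set_property` is a hypothetical helper, not part of Hadoop, and it assumes each property is written with its `<value>` on the same line as, or on a line after, the matching `<name>`.

```shell
# Hypothetical helper: rewrite the <value> belonging to one <name> in a
# Hadoop *-site.xml file. Assumes the value contains no "&" or backslash,
# which awk's sub() would treat specially in the replacement string.
set_property() {
  local file="$1" name="$2" value="$3" tmp
  tmp="$(mktemp)" || return 1
  awk -v n="$name" -v v="$value" '
    index($0, "<name>" n "</name>") { pending = 1 }   # found the property name
    pending && /<value>/ {                             # next <value> belongs to it
      sub(/<value>.*<\/value>/, "<value>" v "</value>")
      pending = 0
    }
    { print }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"           # cat keeps the file permissions
  rm -f "$tmp"
}

# Possible use on node3.zhch and node4.zhch:
#   set_property "$HADOOP_CONF_DIR/hdfs-site.xml" dfs.namenode.shared.edits.dir \
#     "qjournal://node1.zhch:8485;node2.zhch:8485;node3.zhch:8485/yourcluster"
# Possible use on node3.zhch only:
#   set_property "$HADOOP_CONF_DIR/yarn-site.xml" yarn.resourcemanager.ha.id rm2
```

Matching on the full `<name>…</name>` string keeps `yarn.resourcemanager.ha.id` from also hitting `yarn.resourcemanager.ha.rm-ids`.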
2. Formatting and Startup
## Start the ZooKeeper cluster

## On primary namenode1 (node1.zhch) and primary namenode2 (node3.zhch), run: $HADOOP_PREFIX/bin/hdfs zkfc -formatZK
[yyl@node1 ~]$ hdfs zkfc -formatZK
[yyl@node3 ~]$ hdfs zkfc -formatZK
[yyl@node2 ~]$ zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[hadoop-ha, zookeeper]
[zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha
[mycluster, yourcluster]

## Start journalnode on node1.zhch, node2.zhch, and node3.zhch:
[yyl@node1 ~]$ hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-journalnode-node1.zhch.out
[yyl@node1 ~]$ jps
1985 QuorumPeerMain
2222 Jps
2176 JournalNode
[yyl@node2 ~]$ hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-journalnode-node2.zhch.out
[yyl@node2 ~]$ jps
1783 Jps
1737 JournalNode
1638 QuorumPeerMain
[yyl@node3 ~]$ hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-journalnode-node3.zhch.out
[yyl@node3 ~]$ jps
1658 JournalNode
1495 QuorumPeerMain
1704 Jps

## Format the namenode on primary namenode1 (node1.zhch)
[yyl@node1 ~]$ hdfs namenode -format -clusterId c1

## Start the namenode process on primary namenode1 (node1.zhch)
[yyl@node1 ~]$ hadoop-daemon.sh start namenode
starting namenode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-namenode-node1.zhch.out
[yyl@node1 ~]$ jps
2286 NameNode
1985 QuorumPeerMain
2369 Jps
2176 JournalNode

## Synchronize metadata on standby namenode1 (node2.zhch)
[yyl@node2 ~]$ hdfs namenode -bootstrapStandby

## Start the namenode process on standby namenode1 (node2.zhch)
[yyl@node2 ~]$ hadoop-daemon.sh start namenode
starting namenode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-namenode-node2.zhch.out
[yyl@node2 ~]$ jps
1923 Jps
1737 JournalNode
1638 QuorumPeerMain
1840 NameNode

## Format the namenode on primary namenode2 (node3.zhch), using the same cluster ID so that both nameservices belong to one federated cluster
[yyl@node3 ~]$ hdfs namenode -format -clusterId c1

## Start the namenode process on primary namenode2 (node3.zhch)
[yyl@node3 ~]$ hadoop-daemon.sh start namenode
starting namenode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-namenode-node3.zhch.out
[yyl@node3 ~]$ jps
1658 JournalNode
1495 QuorumPeerMain
1767 NameNode
1850 Jps

## Synchronize metadata on standby namenode2 (node4.zhch)
[yyl@node4 ~]$ hdfs namenode -bootstrapStandby

## Start the namenode process on standby namenode2 (node4.zhch)
[yyl@node4 ~]$ hadoop-daemon.sh start namenode
starting namenode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-namenode-node4.zhch.out
[yyl@node4 ~]$ jps
1602 Jps
1519 NameNode

## Start the ZooKeeperFailoverController on every namenode
[yyl@node1 ~]$ hadoop-daemon.sh start zkfc
starting zkfc, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-zkfc-node1.zhch.out
[yyl@node2 ~]$ hadoop-daemon.sh start zkfc
starting zkfc, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-zkfc-node2.zhch.out
[yyl@node3 ~]$ hadoop-daemon.sh start zkfc
starting zkfc, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-zkfc-node3.zhch.out
[yyl@node4 ~]$ hadoop-daemon.sh start zkfc
starting zkfc, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-zkfc-node4.zhch.out

## Start the DataNodes
[yyl@node1 ~]$ hadoop-daemons.sh start datanode
node4.zhch: starting datanode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-datanode-node4.zhch.out
node5.zhch: starting datanode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-datanode-node5.zhch.out
node3.zhch: starting datanode, logging to /home/yyl/program/hadoop-2.5.2/logs/hadoop-yyl-datanode-node3.zhch.out

## Start YARN
[yyl@node1 ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/yyl/program/hadoop-2.5.2/logs/yarn-yyl-resourcemanager-node1.zhch.out
node3.zhch: starting nodemanager, logging to /home/yyl/program/hadoop-2.5.2/logs/yarn-yyl-nodemanager-node3.zhch.out
node4.zhch: starting nodemanager, logging to /home/yyl/program/hadoop-2.5.2/logs/yarn-yyl-nodemanager-node4.zhch.out
node5.zhch: starting nodemanager, logging to /home/yyl/program/hadoop-2.5.2/logs/yarn-yyl-nodemanager-node5.zhch.out

## Start the resource manager on the standby resource manager host (node3.zhch)
[yyl@node3 ~]$ yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /home/yyl/program/hadoop-2.5.2/logs/yarn-yyl-resourcemanager-node3.zhch.out

## Check the resource manager states
[yyl@node1 ~]$ yarn rmadmin -getServiceState rm1
active
[yyl@node1 ~]$ yarn rmadmin -getServiceState rm2
standby
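The final state check above can be wrapped in a small loop so that it reports which resource manager is currently active. This is a sketch only: `find_active_rm` is a hypothetical function name, and it relies on nothing beyond the `yarn rmadmin -getServiceState` command already used in this section.

```shell
# Hypothetical helper: print the id of the first ResourceManager that
# reports "active"; return non-zero if none of the given ids does.
find_active_rm() {
  local id state
  for id in "$@"; do
    # Errors (e.g. an unreachable RM) are treated the same as "not active".
    state="$(yarn rmadmin -getServiceState "$id" 2>/dev/null)"
    if [ "$state" = "active" ]; then
      echo "$id"
      return 0
    fi
  done
  return 1
}

# Example, using the ids from yarn.resourcemanager.ha.rm-ids:
#   find_active_rm rm1 rm2
```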
3. Verification
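Open two terminals, both connected to the primary resource manager host. In terminal A, run jps to find the resource manager process ID; in terminal B, start a MapReduce job. Then, back in terminal A, kill the resource manager process. Finally, check whether the MapReduce job still runs to completion after the primary resource manager process has died.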
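The failover test described above can be sketched as follows. `rm_pid` is a hypothetical helper that picks the ResourceManager PID out of `jps` output; the example job and jar path in the comments are assumptions (the examples jar normally shipped under share/hadoop/mapreduce in the 2.5.2 distribution), not something taken from the original experiment.

```shell
# Hypothetical helper: read `jps` output on stdin and print the PID of the
# ResourceManager process, if there is one.
rm_pid() {
  awk '$2 == "ResourceManager" { print $1 }'
}

# Terminal B (assumed example job; any long-running MapReduce job would do):
#   hadoop jar "$HADOOP_PREFIX"/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar pi 10 100
#
# Terminal A, on the primary resource manager host, while the job is running:
#   pid="$(jps | rm_pid)"
#   kill "$pid"
#   yarn rmadmin -getServiceState rm2   # expected to switch to "active" after failover
```

If automatic failover and recovery behave as configured, the job in terminal B should pause briefly and then continue against the other resource manager.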