This article explains how to set up a Hadoop 2.7.5 + Spark 2.2.1 distributed cluster. It is a common point of confusion in day-to-day work, so the material below has been organized into a simple, practical walkthrough; follow along step by step.
1. Runtime Environment
CentOS 6.5
Spark 2.2.1
Hadoop 2.7.5
Java JDK 1.8
Scala 2.12.5
2. Node IPs and Role Assignments
Node name   | IP           | Spark role     | Hadoop role
hyw-spark-1 | 10.39.60.221 | master, worker | master
hyw-spark-2 | 10.39.60.222 | worker         | slave
hyw-spark-3 | 10.39.60.223 | worker         | slave
3. Basic Environment Configuration
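On CentOS 6.5, basic environment configuration for a cluster like this usually covers host name resolution, a dedicated hadoop user, passwordless SSH from the master to every node, and stopping the firewall. A minimal sketch using the host names and IPs from the table above; treat it as a template rather than the exact original steps:
# On every node: map host names to IPs
$sudo vi /etc/hosts
10.39.60.221 hyw-spark-1
10.39.60.222 hyw-spark-2
10.39.60.223 hyw-spark-3
# On every node: create the hadoop user that the later sections run as
# (grant it sudo rights, e.g. via visudo, since later steps use sudo)
$sudo useradd hadoop
$sudo passwd hadoop
# On every node: stop the firewall so cluster ports are reachable (CentOS 6)
$sudo service iptables stop
$sudo chkconfig iptables off
# On hyw-spark-1, as hadoop: passwordless SSH to all nodes, itself included
$ssh-keygen -t rsa
$ssh-copy-id hadoop@hyw-spark-1
$ssh-copy-id hadoop@hyw-spark-2
$ssh-copy-id hadoop@hyw-spark-3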
4. JDK Installation (run as the hadoop user)
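A plausible install sequence, mirroring the Spark steps in section 7.1; the target path /opt/jdk1.8 matches the JAVA_HOME set in spark-env.sh in section 7.3.1, while the exact archive name is an assumption:
$cd /opt
$sudo tar -xzvf jdk-8u161-linux-x64.tar.gz        # assumed archive name
$sudo mv jdk1.8.0_161/ jdk1.8
$sudo chown -R hadoop:hadoop jdk1.8
$sudo vi /etc/profile
Add the following:
export JAVA_HOME=/opt/jdk1.8
PATH=$JAVA_HOME/bin:$PATH:$HOME/bin
Reload the environment and verify:
$source /etc/profile
$java -version        # should report java version "1.8.0_..."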
5. Scala Installation (run as the hadoop user)
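A sketch in the same pattern; SCALA_HOME=/usr/local/scala matches spark-env.sh in section 7.3.1, and scala-2.12.5.tgz is the standard archive name for the version listed in section 1:
$cd /opt
$sudo tar -xzvf scala-2.12.5.tgz -C /usr/local
$cd /usr/local
$sudo mv scala-2.12.5/ scala
$sudo chown -R hadoop:hadoop scala
$sudo vi /etc/profile
Add the following:
export SCALA_HOME=/usr/local/scala
Reload the environment and verify:
$source /etc/profile
$scala -version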
6. Hadoop Cluster Installation (run as the hadoop user)
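The sub-steps before 6.4.4 cover downloading Hadoop, setting HADOOP_HOME, and editing hadoop-env.sh, yarn-env.sh, and core-site.xml. A sketch under those assumptions; the install path matches HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop from section 7.3.1, and the core-site.xml values are typical minimal settings rather than the article's originals:
$cd /opt
$sudo tar -xzvf hadoop-2.7.5.tar.gz -C /usr/local
$cd /usr/local
$sudo mv hadoop-2.7.5/ hadoop
$sudo chown -R hadoop:hadoop hadoop
$sudo vi /etc/profile
Add the following:
export HADOOP_HOME=/usr/local/hadoop
$source /etc/profile
$cd /usr/local/hadoop/etc/hadoop
$vim hadoop-env.sh        # set: export JAVA_HOME=/opt/jdk1.8
$vim yarn-env.sh          # set: export JAVA_HOME=/opt/jdk1.8
$vim core-site.xml        # end the file with:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hyw-spark-1:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>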
6.4.4. $vim hdfs-site.xml
Modify the end of the file.
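A typical minimal hdfs-site.xml for this three-node layout; the values are illustrative assumptions (the replication factor matches the three DataNodes, and the directory paths follow the hadoop.tmp.dir convention):
<configuration>
  <!-- illustrative values, not necessarily the article's originals -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hyw-spark-1:50090</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/data</value>
  </property>
</configuration>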
6.4.5. $vim mapred-site.xml (if the file does not exist yet, copy it from mapred-site.xml.template first)
Modify the end of the file.
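For MapReduce on YARN the one required setting is the framework name; a minimal sketch:
<configuration>
  <!-- run MapReduce jobs on YARN rather than the local runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>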
6.4.6. $vim yarn-site.xml
Modify the end of the file.
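A minimal yarn-site.xml pointing the NodeManagers at the master's ResourceManager (values again illustrative):
<configuration>
  <!-- the shuffle service MapReduce needs on every NodeManager -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hyw-spark-1</value>
  </property>
</configuration>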
6.4.7. $vim slaves
Add the following content:
hyw-spark-1
hyw-spark-2
hyw-spark-3
6.4.8. Copy the configuration files to the slave nodes (7 files in total)
$scp hadoop-env.sh yarn-env.sh core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml slaves hadoop@hyw-spark-2:/usr/local/hadoop/etc/hadoop/
$scp hadoop-env.sh yarn-env.sh core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml slaves hadoop@hyw-spark-3:/usr/local/hadoop/etc/hadoop/
6.5. Start the Hadoop cluster
6.5.1. Format the NameNode
On the master node, run:
$hdfs namenode -format
On success the output includes "successfully formatted" and "Exitting with status 0"; "Exitting with status 1" indicates an error.
6.5.2. Start HDFS (NameNode, DataNode)
On the master node, run:
$start-dfs.sh
jps on the master now shows processes like:
8757 SecondaryNameNode
7862 DataNode
7723 NameNode
8939 Jps
jps on each of the two slaves shows:
7556 Jps
7486 DataNode
6.5.3. Start YARN (ResourceManager, NodeManager)
On the master node, run:
$start-yarn.sh
jps on the master now shows:
9410 Jps
8757 SecondaryNameNode
8997 ResourceManager
7862 DataNode
9112 NodeManager
7723 NameNode
jps on each of the two slaves shows:
7718 Jps
7607 NodeManager
7486 DataNode
6.5.4. View HDFS status in a browser
Visit http://10.39.60.221:50070 to bring up the HDFS web UI.
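Besides the web UI, a standard command-line check (not a step from the original article, but useful here) confirms that all three DataNodes registered:
$hdfs dfsadmin -report        # look for "Live datanodes (3):" in the output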
7. Spark Installation (run as the hadoop user)
7.1. Download the archive to /opt and extract it to /usr/local
$cd /opt
$sudo tar -xzvf spark-2.2.1-bin-hadoop2.7.tgz -C /usr/local
$cd /usr/local
$sudo mv spark-2.2.1-bin-hadoop2.7/ spark
$sudo chown -R hadoop:hadoop spark
7.2. Set environment variables
$sudo vi /etc/profile
Add the following:
export SPARK_HOME=/usr/local/spark
PATH=$JAVA_HOME/bin:$PATH:$HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SCALA_HOME/bin:$SPARK_HOME/bin:$SPARK_HOME/sbin
Reload the environment:
$source /etc/profile
7.3. Edit the configuration files
Do all of the following on the master node, then scp the results to the slave nodes.
$cd /usr/local/spark/conf
7.3.1. $cp spark-env.sh.template spark-env.sh
$vim spark-env.sh
Add the following:
export JAVA_HOME=/opt/jdk1.8
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
export SCALA_HOME=/usr/local/scala
export SPARK_MASTER_IP=10.39.60.221
export SPARK_WORKER_MEMORY=1g
7.3.2. $cp slaves.template slaves
$vim slaves
Add the following:
hyw-spark-1
hyw-spark-2
hyw-spark-3
7.3.3. Copy the files to the slave nodes
$scp -r spark-env.sh slaves hadoop@hyw-spark-2:/usr/local/spark/conf/
$scp -r spark-env.sh slaves hadoop@hyw-spark-3:/usr/local/spark/conf/
7.4. Start Spark
7.4.1. Start the Master
On the master node, run:
$start-master.sh
jps on the master now shows:
10016 Jps
8757 SecondaryNameNode
8997 ResourceManager
7862 DataNode
9112 NodeManager
9832 Master
7723 NameNode
7.4.2. Start the worker nodes
On the master node, run:
$start-slaves.sh
jps on each of the three workers now shows:
7971 Worker
7486 DataNode
8030 Jps
7.5. View Spark status in a browser
Visit http://10.39.60.221:8080 to bring up the Spark master web UI.
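As a final smoke test you can submit the bundled SparkPi example to the standalone master; the jar below is the one shipped in the Spark 2.2.1 prebuilt-for-Hadoop-2.7 package:
$spark-submit --class org.apache.spark.examples.SparkPi \
  --master spark://10.39.60.221:7077 \
  $SPARK_HOME/examples/jars/spark-examples_2.11-2.2.1.jar 100
A line like "Pi is roughly 3.14..." near the end of the output confirms the cluster is executing jobs.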
This concludes the walkthrough of building a Hadoop 2.7.5 + Spark 2.2.1 distributed cluster. Theory sticks best when paired with practice, so try these steps on your own machines. For more articles like this, keep following the 創新互聯 site.