Building and Deploying Hadoop: HDFS

HDFS: the Hadoop Distributed File System

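Distributed file systems

A distributed file system addresses the problems of storing and managing large amounts of data:

– It extends a file system fixed at one location to any number of locations and file systems.

– Many nodes together form a file system network.

– The nodes can sit in different locations and communicate and transfer data over the network.

– Users do not need to care which node stores the data or which node it is read from; they manage and store data just as they would on a local file system.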

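HDFS roles and concepts

• HDFS is the foundation of data storage and management in the Hadoop ecosystem. It is a highly fault-tolerant system designed to run on low-cost commodity hardware.

• Roles and concepts

    – Client

    – NameNode

    – Secondary NameNode

    – DataNode

• NameNode

    – The master node. It manages the HDFS namespace and the block mapping information, configures the replication policy, and handles all client requests.

• Secondary NameNode

    – Periodically merges the fsimage and the edits log and pushes the merged image back to the NameNode.

    – In an emergency it can help recover the NameNode, but it is not a hot standby for the NameNode.

• DataNode

    – The data storage node; it stores the actual data.

    – Reports its storage information to the NameNode.

• Client

    – Splits files into blocks.

    – Accesses HDFS.

    – Talks to the NameNode to obtain file location information.

    – Talks to the DataNodes to read and write data.

• Block

    – The default block size is 64 MB in Hadoop 1.x and 128 MB in Hadoop 2.x (the version installed below); it is set by dfs.blocksize.

    – Each block can have multiple replicas.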
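To make the block and replica arithmetic concrete, here is a small shell sketch. The function names blocks_needed and raw_bytes_used are made up for this illustration, and the 300 MB file, 128 MB block size, and replica count of 2 are assumed example values rather than measurements from a real cluster.

```shell
#!/bin/bash
# Number of HDFS blocks a file occupies: ceil(file_size / block_size).
# Both arguments are in bytes.
blocks_needed() {
    local file_size=$1 block_size=$2
    echo $(( (file_size + block_size - 1) / block_size ))
}

# Raw bytes stored across the cluster: file size times replica count.
# The last block only takes as much space as the data it actually holds.
raw_bytes_used() {
    local file_size=$1 replicas=$2
    echo $(( file_size * replicas ))
}

MB=$((1024 * 1024))
blocks_needed $((300 * MB)) $((128 * MB))   # prints 3
raw_bytes_used $((300 * MB)) 2              # prints 629145600
```

A 300 MB file therefore splits into two full 128 MB blocks plus one 44 MB block, and with two replicas it consumes about 600 MB of raw DataNode storage.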

Deploying the HDFS distributed file system

Preparing the lab environment:

# vim /etc/hosts

    .. ..

    192.168.4.1 master

    192.168.4.2 node1

    192.168.4.3 node2

    192.168.4.4 node3

# sed -ri '/^Host \*/a StrictHostKeyChecking no' /etc/ssh/ssh_config

# ssh-keygen

# for i in {1..4} 

> do

> ssh-copy-id 192.168.4.${i}

> done

# for i in {1..4}        # sync the local hosts file to every machine

> do

> rsync -a /etc/hosts 192.168.4.${i}:/etc/hosts

> done

# rm -rf /etc/yum.repos.d/*

# vim /etc/yum.repos.d/yum.repo   # configure the network yum repository

    [yum]

    name=yum

    baseurl=http://192.168.4.254/rhel7

    gpgcheck=0

# for i in {2..4}

> do

> ssh 192.168.4.${i} "rm -rf /etc/yum.repos.d/*"

> rsync -a /etc/yum.repos.d/yum.repo 192.168.4.${i}:/etc/yum.repos.d/

> done

# for i in {1..4}

> do

> ssh 192.168.4.${i} 'sed -ri "s/^(SELINUX=).*/\1disabled/" /etc/selinux/config ; yum -y remove firewalld' 

> done

// Reboot all machines

Building a fully distributed cluster

System plan:

IP address      Hostname    Role                           Software

192.168.4.1     master      NameNode, SecondaryNameNode    HDFS

192.168.4.2     node1       DataNode                       HDFS

192.168.4.3     node2       DataNode                       HDFS

192.168.4.4     node3       DataNode                       HDFS

Install the Java environment and the jps debugging tool on all systems

# for i in {1..4}

> do

> ssh 192.168.4.${i} "yum -y install java-1.8.0-openjdk-devel.x86_64"

> done

# which java

/usr/bin/java

# readlink -f /usr/bin/java

/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64/jre/bin/java

Install Hadoop

# tar -xf hadoop-2.7.3.tar.gz

# mv hadoop-2.7.3 /usr/local/hadoop

Edit the configuration

# cd /usr/local/hadoop/

# sed -ri "s;(export JAVA_HOME=).*;\1/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64/jre;" etc/hadoop/hadoop-env.sh

# sed -ri "s;(export HADOOP_CONF_DIR=).*;\1/usr/local/hadoop/etc/hadoop;" etc/hadoop/hadoop-env.sh

# sed -n "25p;33p" etc/hadoop/hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64/jre

export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
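The JAVA_HOME value used above is simply the readlink -f result with the trailing /bin/java removed. A minimal sketch of that derivation follows; java_home_from_binary is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/bash
# Derive JAVA_HOME from the fully resolved path of the java binary
# by stripping the trailing "/bin/java" component.
java_home_from_binary() {
    local java_bin=$1
    echo "${java_bin%/bin/java}"
}

# Path taken from the readlink output shown earlier.
java_home_from_binary /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64/jre/bin/java
# prints /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64/jre
```

On a live machine the argument could be supplied as "$(readlink -f "$(which java)")", which avoids hard-coding the OpenJDK build number in hadoop-env.sh.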

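// Parameter reference: http://hadoop.apache.org/docs/r2.7.5/hadoop-project-dist/hadoop-common/core-default.xml

# vim etc/hadoop/core-site.xml

.. .. 

<configuration>
    <property>
        <name>fs.defaultFS</name>            <!-- the default file system -->
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>          <!-- base directory where Hadoop keeps its data -->
        <value>/var/hadoop</value>
    </property>
</configuration>

// Create the base directory on all machines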

# for i in {1..4}

> do

> ssh 192.168.4.${i} "mkdir /var/hadoop"

> done

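// Parameter reference: http://hadoop.apache.org/docs/r2.7.5/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

# vim etc/hadoop/hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.http-address</name>              <!-- NameNode HTTP address -->
        <value>master:50070</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>    <!-- SecondaryNameNode HTTP address -->
        <value>master:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>                        <!-- number of replicas kept for each block -->
        <value>2</value>
    </property>
</configuration>

# vim etc/hadoop/slaves         # hosts on which the DataNodes run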

node1

node2 

node3

Once the configuration is complete, copy the hadoop directory to all machines

# for i in {2..4}

> do

> rsync -azSH --delete /usr/local/hadoop 192.168.4.${i}:/usr/local/ -e "ssh"

> done

// Format HDFS on the NameNode

# ./bin/hdfs namenode -format

If the output contains "successfully formatted", the format succeeded.

// If there were no errors, start the cluster

# ./sbin/start-dfs.sh   

After startup, run jps on the NameNode and on each DataNode:

# for i in master node{1..3}

> do

> echo $i

> ssh ${i} "jps"

> done

master

4562 SecondaryNameNode

4827 NameNode

5149 Jps

node1

3959 DataNode

4105 Jps

node2

3957 Jps

3803 DataNode

node3

3956 Jps

3803 DataNode
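Reading the jps output by eye works for four machines; for a scripted check, something like the following sketch could be used. The helper name has_daemon is hypothetical, and the sample text is copied from the master's output above.

```shell
#!/bin/bash
# Succeed (exit status 0) if the named daemon appears in jps-style output
# read from stdin, where each line has the form "<pid> <name>".
has_daemon() {
    local daemon=$1
    awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'
}

sample='4562 SecondaryNameNode
4827 NameNode
5149 Jps'

if printf '%s\n' "$sample" | has_daemon NameNode; then
    echo "NameNode is running"
else
    echo "NameNode is missing"
fi
```

Against a real node the input would come from the command itself, for example: ssh node1 jps | has_daemon DataNode. Matching on the whole second field keeps NameNode from being confused with SecondaryNameNode.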

# ./bin/hdfs dfsadmin -report                # list the DataNodes that registered successfully

    Configured Capacity: 160982630400 (149.93 GB)

    Present Capacity: 150644051968 (140.30 GB)

    DFS Remaining: 150644039680 (140.30 GB)

    DFS Used: 12288 (12 KB)

    DFS Used%: 0.00%

    Under replicated blocks: 0

    Blocks with corrupt replicas: 0

    Missing blocks: 0

    Missing blocks (with replication factor 1): 0

    

    -------------------------------------------------

    Live datanodes (3):

    

    Name: 192.168.4.2:50010 (node1)

    Hostname: node1

    Decommission Status : Normal

    Configured Capacity: 53660876800 (49.98 GB)

    DFS Used: 4096 (4 KB)

    Non DFS Used: 3446755328 (3.21 GB)

    DFS Remaining: 50214117376 (46.77 GB)

    DFS Used%: 0.00%

    DFS Remaining%: 93.58%

    Configured Cache Capacity: 0 (0 B)

    Cache Used: 0 (0 B)

    Cache Remaining: 0 (0 B)

    Cache Used%: 100.00%

    Cache Remaining%: 0.00%

    Xceivers: 1

    Last contact: Mon Jan 29 21:17:39 EST 2018

    

    

    Name: 192.168.4.4:50010 (node3)

    Hostname: node3

    Decommission Status : Normal

    Configured Capacity: 53660876800 (49.98 GB)

    DFS Used: 4096 (4 KB)

    Non DFS Used: 3445944320 (3.21 GB)

    DFS Remaining: 50214928384 (46.77 GB)

    DFS Used%: 0.00%

    DFS Remaining%: 93.58%

    Configured Cache Capacity: 0 (0 B)

    Cache Used: 0 (0 B)

    Cache Remaining: 0 (0 B)

    Cache Used%: 100.00%

    Cache Remaining%: 0.00%

    Xceivers: 1

    Last contact: Mon Jan 29 21:17:39 EST 2018

    

    

    Name: 192.168.4.3:50010 (node2)

    Hostname: node2

    Decommission Status : Normal

    Configured Capacity: 53660876800 (49.98 GB)

    DFS Used: 4096 (4 KB)

    Non DFS Used: 3445878784 (3.21 GB)

    DFS Remaining: 50214993920 (46.77 GB)

    DFS Used%: 0.00%

    DFS Remaining%: 93.58%

    Configured Cache Capacity: 0 (0 B)

    Cache Used: 0 (0 B)

    Cache Remaining: 0 (0 B)

    Cache Used%: 100.00%

    Cache Remaining%: 0.00%

    Xceivers: 1

    Last contact: Mon Jan 29 21:17:39 EST 2018
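The line to look for in this report is "Live datanodes (3):", which should match the number of hosts in the slaves file. As a rough sketch, the count can be pulled out of the report text like this; live_datanodes is a hypothetical helper name, and the sample input is abbreviated from the report above.

```shell
#!/bin/bash
# Print the number of live DataNodes from "hdfs dfsadmin -report" text on stdin.
# Prints nothing if the "Live datanodes (N):" line is absent.
live_datanodes() {
    sed -n 's/^[[:space:]]*Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}

sample='    Missing blocks: 0
    -------------------------------------------------
    Live datanodes (3):
    Name: 192.168.4.2:50010 (node1)'

printf '%s\n' "$sample" | live_datanodes   # prints 3
```

On the master this could be fed with ./bin/hdfs dfsadmin -report | live_datanodes and compared against the expected node count.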

[Figures: namenode, secondarynamenode, datanode]

Basic HDFS usage

The basic HDFS commands are almost identical to their shell counterparts

# ./bin/hadoop fs -ls hdfs://master:9000/

# ./bin/hadoop fs -mkdir /test

# ./bin/hadoop fs -ls /

Found 1 items

drwxr-xr-x   - root supergroup          0 2018-01-29 21:35 /test

# ./bin/hadoop fs -rmdir /test

# ./bin/hadoop fs -mkdir /input

# ./bin/hadoop fs -put *.txt /input                    # upload files

# ./bin/hadoop fs -ls /input

Found 3 items

-rw-r--r--   2 root supergroup      84854 2018-01-29 21:37 /input/LICENSE.txt

-rw-r--r--   2 root supergroup      14978 2018-01-29 21:37 /input/NOTICE.txt

-rw-r--r--   2 root supergroup       1366 2018-01-29 21:37 /input/README.txt

# ./bin/hadoop fs -get /input/README.txt /root/            # download a file

# ls /root/README.txt 

/root/README.txt
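In the hadoop fs -ls listing, the second column of each file line is the file's replication factor (2 here, matching dfs.replication); directories show "-" in that column. The sketch below extracts that column; file_replication is a hypothetical helper name, and it assumes paths without spaces.

```shell
#!/bin/bash
# Print "<replication> <path>" for every regular file in "hadoop fs -ls"
# output read from stdin. Directory lines (starting with "d") and the
# "Found N items" header are skipped. Assumes paths contain no spaces.
file_replication() {
    awk '$1 ~ /^-/ { print $2, $NF }'
}

sample='Found 3 items
-rw-r--r--   2 root supergroup      84854 2018-01-29 21:37 /input/LICENSE.txt
-rw-r--r--   2 root supergroup      14978 2018-01-29 21:37 /input/NOTICE.txt
-rw-r--r--   2 root supergroup       1366 2018-01-29 21:37 /input/README.txt'

printf '%s\n' "$sample" | file_replication
```

With a running cluster the same filter could be applied as ./bin/hadoop fs -ls /input | file_replication to confirm that uploaded files carry the intended number of replicas.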

Adding a node to HDFS

– 1. Prepare the full Hadoop environment on the new node: hostname, passwordless ssh login, SELinux and iptables disabled, Java installed

[root@newnode ~]# yum -y install java-1.8.0-openjdk-devel.x86_64 

[root@master ~] # cat /etc/hosts

192.168.4.1 master

192.168.4.2 node1

192.168.4.3 node2

192.168.4.4 node3

192.168.4.5  newnode

– 2. Add the new node to the slaves file on the NameNode

[root@master ~]# cd /usr/local/hadoop/etc/hadoop/

[root@master hadoop]# echo newnode >> slaves 

– 3. Copy the NameNode's configuration files to the configuration directory on the other nodes

# cat /root/rsyncfile.sh 

#!/bin/bash

for i in node{1..3}

do

  rsync -azSH --delete /usr/local/hadoop/etc/hadoop ${i}:/usr/local/hadoop/etc/ -e 'ssh' &

done

wait

[root@master hadoop]# bash /root/rsyncfile.sh

– 4. Copy the hadoop directory from the master to the new node

[root@newnode ~]# rsync -azSH --delete master:/usr/local/hadoop /usr/local

– 5. Start the DataNode on the new node

[root@newnode ~]# cd /usr/local/hadoop/

[root@newnode hadoop]# ./sbin/hadoop-daemon.sh start datanode

[root@newnode hadoop]# jps

4007 Jps

3705 DataNode

– 6. Check the cluster status

[root@master hadoop]# cd /usr/local/hadoop/

[root@master hadoop]# ./bin/hdfs dfsadmin -report

Safe mode is ON

Configured Capacity: 268304384000 (249.88 GB)

Present Capacity: 249863049216 (232.70 GB)

DFS Remaining: 249862311936 (232.70 GB)

DFS Used: 737280 (720 KB)

DFS Used%: 0.00%

Under replicated blocks: 0

Blocks with corrupt replicas: 0

Missing blocks: 0

Missing blocks (with replication factor 1): 0

-------------------------------------------------

Live datanodes (5):

...

Name: 192.168.4.5:50010 (newnode)

Hostname: newnode

Decommission Status : Normal

Configured Capacity: 53660876800 (49.98 GB)

DFS Used: 4096 (4 KB)

Non DFS Used: 3662835712 (3.41 GB)

DFS Remaining: 49998036992 (46.56 GB)

DFS Used%: 0.00%

DFS Remaining%: 93.17%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 1

Last contact: Sun Jan 28 20:30:23 EST 2018

...

– 7. Set the balancer bandwidth and rebalance the data

[root@master hadoop]# ./bin/hdfs dfsadmin -setBalancerBandwidth 67108864

[root@master hadoop]# ./sbin/start-balancer.sh -threshold 5
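The value passed to -setBalancerBandwidth is in bytes per second, so 67108864 corresponds to 64 MiB/s; -threshold 5 asks the balancer to bring each DataNode's disk usage to within 5 percentage points of the cluster average. A small sketch of the unit conversion, with mib_to_bytes as a hypothetical helper name:

```shell
#!/bin/bash
# Convert a bandwidth given in MiB per second to bytes per second,
# the unit expected by "hdfs dfsadmin -setBalancerBandwidth".
mib_to_bytes() {
    local mib=$1
    echo $(( mib * 1024 * 1024 ))
}

mib_to_bytes 64   # prints 67108864
```

This makes it easier to pick a different limit, for example ./bin/hdfs dfsadmin -setBalancerBandwidth "$(mib_to_bytes 32)" on a network that is shared with production traffic.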

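Removing a node

– Configure hdfs-site.xml on the NameNode

– dfs.replication: the number of replicas

– Add the dfs.hosts.exclude setting

[root@master hadoop]# vim etc/hadoop/hdfs-site.xml 

...

    <property>
        <name>dfs.hosts.exclude</name>
        <value>/usr/local/hadoop/etc/hadoop/exclude</value>
    </property>

...

– Create the exclude file and list the nodes to remove (by hostname here)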

[root@master hadoop]# vim etc/hadoop/slaves 

node1

node2 

node3

[root@master hadoop]# vim  etc/hadoop/exclude

newnode

# cat /root/rsyncfile.sh 

#!/bin/bash

for i in node{1..3} newnode

do

  rsync -azSH --delete /usr/local/hadoop/etc/hadoop ${i}:/usr/local/hadoop/etc/ -e 'ssh' &

done

wait

[root@master hadoop]# bash /root/rsyncfile.sh

[root@master hadoop]# ./bin/hdfs dfsadmin -refreshNodes

[root@master hadoop]# ./bin/hdfs dfsadmin -report

...

Name: 192.168.4.6:50010 (newnode)

Hostname: newnode

Decommission Status : Decommission in progress        // data is being migrated

Configured Capacity: 53660876800 (49.98 GB)

DFS Used: 12288 (12 KB)

Non DFS Used: 3662950400 (3.41 GB)

DFS Remaining: 49997914112 (46.56 GB)

DFS Used%: 0.00%

DFS Remaining%: 93.17%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 1

Last contact: Sun Jan 28 20:52:01 EST 2018

...

[root@master hadoop]# ./bin/hdfs dfsadmin -report

...

Name: 192.168.4.6:50010 (newnode)

Hostname: newnode

Decommission Status : Decommissioned                    // final state

Configured Capacity: 53660876800 (49.98 GB)

DFS Used: 12288 (12 KB)

Non DFS Used: 3662950400 (3.41 GB)

DFS Remaining: 49997914112 (46.56 GB)

DFS Used%: 0.00%

DFS Remaining%: 93.17%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 1

Last contact: Sun Jan 28 20:52:43 EST 2018

...

// Stop the node only after its status has changed to Decommissioned

[root@newnode hadoop]# ./sbin/hadoop-daemon.sh stop datanode

[root@newnode hadoop]# jps

4045 Jps
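Because the DataNode must not be stopped before its status reaches Decommissioned, it can help to read that status programmatically instead of scanning the report by hand. The following is a sketch only; decommission_status is a hypothetical helper name, and the sample text is abbreviated from the reports above.

```shell
#!/bin/bash
# Print the "Decommission Status" value for one DataNode, given
# "hdfs dfsadmin -report" text on stdin and the node's hostname.
decommission_status() {
    local host=$1
    awk -v h="$host" '
        $1 == "Hostname:" { current = $2 }
        $1 == "Decommission" && $2 == "Status" && current == h && !printed {
            sub(/^[^:]*:[[:space:]]*/, "")   # drop the "Decommission Status : " label
            print
            printed = 1
        }'
}

sample='Name: 192.168.4.2:50010 (node1)
Hostname: node1
Decommission Status : Normal
Name: 192.168.4.5:50010 (newnode)
Hostname: newnode
Decommission Status : Decommission in progress'

printf '%s\n' "$sample" | decommission_status newnode   # prints Decommission in progress
```

A wait loop on the master could then poll ./bin/hdfs dfsadmin -report | decommission_status newnode every few seconds and only proceed to hadoop-daemon.sh stop datanode once the result is exactly "Decommissioned".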


Source: http://weahome.cn/article/pecscj.html
