In a test environment, the NameNode was reformatted to meet a testing requirement, after which the DataNode could no longer start.
1. Check the DataNode log. It contains the error "Initialization failed for Block pool":
2018-01-27 20:09:49,052 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to c6704/192.168.67.104:9000. Exiting.
java.io.IOException: All specified directories are failed to load.
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1361)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1326)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:801)
        at java.lang.Thread.run(Thread.java:745)
2018-01-27 20:09:49,056 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to c6705/192.168.67.105:9000. Exiting.
java.io.IOException: All specified directories are failed to load.
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1361)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1326)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:801)
        at java.lang.Thread.run(Thread.java:745)
2018-01-27 20:09:49,069 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool (Datanode Uuid unassigned) service to c6705/192.168.67.105:9000
2018-01-27 20:09:49,070 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool (Datanode Uuid unassigned) service to c6704/192.168.67.104:9000
2018-01-27 20:09:49,192 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool (Datanode Uuid unassigned)
2018-01-27 20:09:51,193 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2018-01-27 20:09:51,204 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2018-01-27 20:09:51,208 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at c6706.python279.org/192.168.67.106
************************************************************/
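To pull this error out of a long DataNode log quickly, something like the following may help. It is only a minimal sketch: the function name `datanode_fatal_lines` and the example log path in the comment are assumptions, not part of the original setup.

```shell
#!/usr/bin/env bash
# Print every FATAL line logged by the DataNode class, together with the
# line that follows it (grep -A 1), which is usually the exception message.
datanode_fatal_lines() {
    local log_file="$1"
    grep -A 1 'FATAL org.apache.hadoop.hdfs.server.datanode.DataNode' "$log_file"
}

# Example call (hypothetical log location, adjust to your installation):
# datanode_fatal_lines /var/log/hadoop/hadoop-hdfs-datanode-c6706.log
```

When several FATAL entries are not adjacent, grep separates the groups with a `--` line.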
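2. A Baidu search on the log message pointed to the cause: the DataNode's clusterID does not match the NameNode's clusterID.
Look up the DataNode and NameNode storage directories configured in hdfs-site.xml, open the current/VERSION file in each, and compare them.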
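The comparison can also be scripted. The sketch below assumes the VERSION file uses plain `key=value` lines, as in the listings in this post. The function names `cluster_id` and `same_cluster_id` are made up for illustration. Because the two files live on different hosts, either run `cluster_id` on each host and compare the output by eye, or copy one file over first. If the storage directories are not known, `hdfs getconf -confKey dfs.namenode.name.dir` and `hdfs getconf -confKey dfs.datanode.data.dir` should print the configured values.

```shell
#!/usr/bin/env bash
# Print the clusterID value stored in a Hadoop storage VERSION file.
cluster_id() {
    local version_file="$1"
    sed -n 's/^clusterID=//p' "$version_file"
}

# Succeed (exit status 0) only if both VERSION files carry the same,
# non-empty clusterID.
same_cluster_id() {
    local nn_id dn_id
    nn_id=$(cluster_id "$1")
    dn_id=$(cluster_id "$2")
    [ -n "$nn_id" ] && [ "$nn_id" = "$dn_id" ]
}

# Example calls, using the paths from this post:
# cluster_id /data/hadoop/hdfs/name/current/VERSION    # on the NameNode host
# cluster_id /data/hadoop/hdfs/data/current/VERSION    # on the DataNode host
```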
3. The NameNode's VERSION file contains:
[hdfs@c6704 $ cat /data/hadoop/hdfs/name/current/VERSION
#Sat Jan 27 00:46:30 UTC 2018
namespaceID=1148548909
clusterID=CID-aedb2e82-77f2-4056-b676-dca88083215d
cTime=0
storageType=NAME_NODE
blockpoolID=BP-1099214307-192.168.67.104-1517013990445
layoutVersion=-63
4. The DataNode's VERSION file contains:
[hdfs@c6706 ~]$ cat /data/hadoop/hdfs/data/current/VERSION
#Sat Jan 27 00:20:21 UTC 2018
storageID=DS-8f0fdd04-e967-43cd-bd41-93b826b675b8
clusterID=CID-b27ecfd8-64ba-4e43-bd82-4ef6f2edd60c
cTime=0
datanodeUuid=264b1b43-82c0-411c-859f-32761edc7465
storageType=DATA_NODE
layoutVersion=-56
5. The NameNode and DataNode VERSION files differ: the clusterID values do not match. I backed up the DataNode's VERSION file, emptied it, and started the DataNode. The problem persisted, and VERSION was still empty afterwards.
[hdfs@c6706 current]$ cp VERSION VERSION.bk
[hdfs@c6706 current]$ echo > VERSION
[hdfs@c6706 current]$ cat VERSION
6. Delete the VERSION file and start the DataNode again. VERSION is regenerated, and its clusterID now matches the NameNode's.
$ cat VERSION
#Sun Jan 28 01:29:46 UTC 2018
storageID=DS-1c1f5e05-df2c-40de-b39b-d6d54e3c4894
clusterID=CID-aedb2e82-77f2-4056-b676-dca88083215d ##<<<<< now in sync
cTime=0
datanodeUuid=948d5780-053e-4752-9476-fb1d1debda72
storageType=DATA_NODE
layoutVersion=-56
7. The DataNode now shows up in the web UI as well.
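8. Root cause
Running hdfs namenode -format deletes and recreates the NameNode's current directory, so the clusterID in its VERSION file changes. The clusterID in the DataNode's VERSION file stays the same, which leaves the two clusterIDs out of sync.
To avoid this, after formatting the NameNode, either delete the DataNode's current directory, or edit the clusterID in the DataNode's VERSION file so that it matches the clusterID in the NameNode's VERSION file, and then restart the DataNode.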
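The second option, editing the clusterID by hand, could be scripted roughly as follows. This is a sketch under stated assumptions rather than a tested procedure for a real cluster: the function name `set_cluster_id` is made up here, the DataNode is assumed to be stopped while the file is edited, and the new ID is assumed to contain only characters that are safe in a sed replacement (letters, digits and hyphens, as in the IDs shown in this post).

```shell
#!/usr/bin/env bash
# Rewrite the clusterID line of a DataNode VERSION file, keeping the
# previous content in VERSION.bk. Run only while the DataNode is stopped.
set_cluster_id() {
    local new_id="$1" version_file="$2"
    cp -p "$version_file" "$version_file.bk" || return 1
    # Write to a temporary file and move it into place, which avoids the
    # non-portable in-place option of sed.
    sed "s/^clusterID=.*/clusterID=$new_id/" "$version_file.bk" > "$version_file.tmp" &&
        mv "$version_file.tmp" "$version_file"
}

# Example call, using the NameNode clusterID and DataNode path from this post:
# set_cluster_id CID-aedb2e82-77f2-4056-b676-dca88083215d /data/hadoop/hdfs/data/current/VERSION
```

All other lines in VERSION are left untouched, so the DataNode keeps its existing storageID and datanodeUuid, unlike the delete-and-regenerate approach in step 6.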
Reference:
http://blog.csdn.net/liuxinghao/article/details/40121843