How to run wordcount in Hadoop-1.2.1

This article shows how to run the wordcount example that ships with Hadoop-1.2.1: create two small input files, start Hadoop, upload the files to HDFS, run the job, and read the result.


1. Create two text files under the home directory

[wukong@bd01 ~]$ mkdir test
[wukong@bd01 ~]$ cd test
[wukong@bd01 test]$ ls
[wukong@bd01 test]$ echo "hello world" >text1
[wukong@bd01 test]$ echo "hello hadoop" >text2
[wukong@bd01 test]$ cat text1
hello world
[wukong@bd01 test]$ cat text2
hello hadoop

2. Start Hadoop

[wukong@bd01 bin]$ ./start-all.sh
starting namenode, logging to /home/wukong/a_usr/hadoop-1.2.1/libexec/../logs/hadoop-wukong-namenode-bd01.out
bd02: starting datanode, logging to /home/wukong/a_usr/hadoop-1.2.1/libexec/../logs/hadoop-wukong-datanode-bd02.out
bd01: starting secondarynamenode, logging to /home/wukong/a_usr/hadoop-1.2.1/libexec/../logs/hadoop-wukong-secondarynamenode-bd01.out
starting jobtracker, logging to /home/wukong/a_usr/hadoop-1.2.1/libexec/../logs/hadoop-wukong-jobtracker-bd01.out
bd02: starting tasktracker, logging to /home/wukong/a_usr/hadoop-1.2.1/libexec/../logs/hadoop-wukong-tasktracker-bd02.out
[wukong@bd01 bin]$ jps
1440 Jps
1132 NameNode
1280 SecondaryNameNode
1364 JobTracker
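
According to the start-all.sh log above, the datanode and tasktracker are started on bd02, so jps on bd01 lists only NameNode, SecondaryNameNode and JobTracker. The check can be scripted as a rough sketch. The helper name check_daemons is hypothetical and not part of Hadoop; it only scans text in the "<pid> <name>" layout that jps prints above.

```shell
# Hypothetical helper (not part of Hadoop): reads jps-style output on stdin
# and reports which of the daemon names given as arguments are missing.
check_daemons() {
  jps_output=$(cat)
  missing=""
  for daemon in "$@"; do
    # Each jps line is "<pid> <name>"; compare the last field exactly,
    # so that NameNode does not match SecondaryNameNode.
    if ! printf '%s\n' "$jps_output" | awk '{ print $NF }' | grep -qx "$daemon"; then
      missing="$missing $daemon"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "all daemons running"
}

# Sample copied from the jps output above; on a live master, pipe jps instead.
sample="1440 Jps
1132 NameNode
1280 SecondaryNameNode
1364 JobTracker"
printf '%s\n' "$sample" | check_daemons NameNode SecondaryNameNode JobTracker
```

On bd02 the same helper could be called with DataNode and TaskTracker as arguments, assuming jps is run there.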

3. Upload the new directory to HDFS

[wukong@bd01 ~]$ a_usr/hadoop-1.2.1/bin/hadoop fs -put ./test test_in
[wukong@bd01 ~]$ a_usr/hadoop-1.2.1/bin/hadoop fs -ls ./test_in
Found 2 items
-rw-r--r--   1 wukong supergroup         12 2014-07-31 15:38 /user/wukong/test_in/text1
-rw-r--r--   1 wukong supergroup         13 2014-07-31 15:38 /user/wukong/test_in/text2
[wukong@bd01 ~]$ a_usr/hadoop-1.2.1/bin/hadoop fs -ls
Found 1 items
drwxr-xr-x   - wukong supergroup          0 2014-07-31 15:38 /user/wukong/test_in

4. Run the wordcount program

[wukong@bd01 hadoop-1.2.1]$ bin/hadoop jar hadoop-examples-1.2.1.jar wordcount test_in test_out
14/07/31 15:43:44 INFO input.FileInputFormat: Total input paths to process : 2
14/07/31 15:43:44 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/07/31 15:43:44 WARN snappy.LoadSnappy: Snappy native library not loaded
14/07/31 15:43:46 INFO mapred.JobClient: Running job: job_201407311530_0001
14/07/31 15:43:47 INFO mapred.JobClient:  map 0% reduce 0%
14/07/31 15:44:11 INFO mapred.JobClient:  map 100% reduce 0%
14/07/31 15:44:27 INFO mapred.JobClient:  map 100% reduce 100%
14/07/31 15:44:29 INFO mapred.JobClient: Job complete: job_201407311530_0001
14/07/31 15:44:29 INFO mapred.JobClient: Counters: 29
14/07/31 15:44:29 INFO mapred.JobClient:   Job Counters
14/07/31 15:44:29 INFO mapred.JobClient:     Launched reduce tasks=1
14/07/31 15:44:29 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=43406
14/07/31 15:44:29 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/07/31 15:44:29 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/07/31 15:44:29 INFO mapred.JobClient:     Launched map tasks=2
14/07/31 15:44:29 INFO mapred.JobClient:     Data-local map tasks=2
14/07/31 15:44:29 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=14688
14/07/31 15:44:29 INFO mapred.JobClient:   File Output Format Counters
14/07/31 15:44:29 INFO mapred.JobClient:     Bytes Written=25
14/07/31 15:44:29 INFO mapred.JobClient:   FileSystemCounters
14/07/31 15:44:29 INFO mapred.JobClient:     FILE_BYTES_READ=55
14/07/31 15:44:29 INFO mapred.JobClient:     HDFS_BYTES_READ=239
14/07/31 15:44:29 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=176694
14/07/31 15:44:29 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=25
14/07/31 15:44:29 INFO mapred.JobClient:   File Input Format Counters
14/07/31 15:44:29 INFO mapred.JobClient:     Bytes Read=25
14/07/31 15:44:29 INFO mapred.JobClient:   Map-Reduce Framework
14/07/31 15:44:29 INFO mapred.JobClient:     Map output materialized bytes=61
14/07/31 15:44:29 INFO mapred.JobClient:     Map input records=2
14/07/31 15:44:29 INFO mapred.JobClient:     Reduce shuffle bytes=61
14/07/31 15:44:29 INFO mapred.JobClient:     Spilled Records=8
14/07/31 15:44:29 INFO mapred.JobClient:     Map output bytes=41
14/07/31 15:44:29 INFO mapred.JobClient:     Total committed heap usage (bytes)=417439744
14/07/31 15:44:29 INFO mapred.JobClient:     CPU time spent (ms)=2880
14/07/31 15:44:29 INFO mapred.JobClient:     Combine input records=4
14/07/31 15:44:29 INFO mapred.JobClient:     SPLIT_RAW_BYTES=214
14/07/31 15:44:29 INFO mapred.JobClient:     Reduce input records=4
14/07/31 15:44:29 INFO mapred.JobClient:     Reduce input groups=3
14/07/31 15:44:29 INFO mapred.JobClient:     Combine output records=4
14/07/31 15:44:29 INFO mapred.JobClient:     Physical memory (bytes) snapshot=418050048
14/07/31 15:44:29 INFO mapred.JobClient:     Reduce output records=3
14/07/31 15:44:29 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2174017536
14/07/31 15:44:29 INFO mapred.JobClient:     Map output records=4
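
Several counters in the log can be cross-checked against the two input files with ordinary Unix tools. The sketch below recreates the files from step 1 in a temporary directory and assumes that the example mapper emits one record per whitespace-separated word; the variable names are chosen here for illustration only.

```shell
# Recreate the two input files from step 1 in a scratch directory.
workdir=$(mktemp -d)
echo "hello world"  > "$workdir/text1"
echo "hello hadoop" > "$workdir/text2"

# One map input record per input line.
map_input_records=$(cat "$workdir"/text* | wc -l | tr -d ' ')
# One map output record per word (assumption stated above).
map_output_records=$(cat "$workdir"/text* | wc -w | tr -d ' ')
# One reduce input group per distinct word.
reduce_input_groups=$(cat "$workdir"/text* | tr -s '[:space:]' '\n' | sort -u | wc -l | tr -d ' ')
# Total input size in bytes; echo appends a newline to each file.
bytes_read=$(cat "$workdir"/text* | wc -c | tr -d ' ')

echo "Map input records=$map_input_records"
echo "Map output records=$map_output_records"
echo "Reduce input groups=$reduce_input_groups"
echo "Bytes Read=$bytes_read"
rm -rf "$workdir"
```

The four values should agree with the lines of the same name in the log: 2 lines, 4 words, 3 distinct words, and 12 + 13 = 25 bytes.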

After the job finishes, list the output:

[wukong@bd01 hadoop-1.2.1]$ bin/hadoop fs -ls
Found 2 items
drwxr-xr-x   - wukong supergroup          0 2014-07-31 15:38 /user/wukong/test_in
drwxr-xr-x   - wukong supergroup          0 2014-07-31 15:44 /user/wukong/test_out
[wukong@bd01 hadoop-1.2.1]$ a_usr/hadoop-1.2.1/bin/hadoop fs -ls ./test_out
-bash: a_usr/hadoop-1.2.1/bin/hadoop: No such file or directory
[wukong@bd01 hadoop-1.2.1]$ bin/hadoop fs -ls ./test_out
Found 3 items
-rw-r--r--   1 wukong supergroup          0 2014-07-31 15:44 /user/wukong/test_out/_SUCCESS
drwxr-xr-x   - wukong supergroup          0 2014-07-31 15:43 /user/wukong/test_out/_logs
-rw-r--r--   1 wukong supergroup         25 2014-07-31 15:44 /user/wukong/test_out/part-r-00000

5. The final result is in part-r-00000

[wukong@bd01 hadoop-1.2.1]$ bin/hadoop fs -cat ./test_out/part-r-00000
hadoop  1
hello   2
world   1
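
For input this small, the result can be compared with a purely local count. The pipeline below is only a sanity check with standard Unix tools, not how Hadoop computes the result; it recreates the files from step 1 and prints word and count separated by a tab, mirroring the layout of part-r-00000.

```shell
# Recreate the two input files from step 1 in a scratch directory.
workdir=$(mktemp -d)
echo "hello world"  > "$workdir/text1"
echo "hello hadoop" > "$workdir/text2"

# Split on whitespace into one word per line, sort, count duplicates,
# then print "word<TAB>count".
local_counts=$(cat "$workdir/text1" "$workdir/text2" \
  | tr -s '[:space:]' '\n' \
  | sort \
  | uniq -c \
  | awk '{ print $2 "\t" $1 }')

echo "$local_counts"
rm -rf "$workdir"
```

The printed lines should match the output of the fs -cat command above.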

That completes the wordcount run on Hadoop-1.2.1.
Source: http://weahome.cn/article/jpecee.html