This article walks through running the bundled wordcount example on Hadoop 1.2.1: create two small input files, start Hadoop, copy the input to HDFS, run the job, and read the result.
1. Create a test directory with two text files in the home directory
[wukong@bd01 ~]$ mkdir test
[wukong@bd01 ~]$ cd test
[wukong@bd01 test]$ ls
[wukong@bd01 test]$ echo "hello world" >text1
[wukong@bd01 test]$ echo "hello hadoop" >text2
[wukong@bd01 test]$ cat text1
hello world
[wukong@bd01 test]$ cat text2
hello hadoop
2. Start Hadoop
[wukong@bd01 bin]$ ./start-all.sh
starting namenode, logging to /home/wukong/a_usr/hadoop-1.2.1/libexec/../logs/hadoop-wukong-namenode-bd01.out
bd02: starting datanode, logging to /home/wukong/a_usr/hadoop-1.2.1/libexec/../logs/hadoop-wukong-datanode-bd02.out
bd01: starting secondarynamenode, logging to /home/wukong/a_usr/hadoop-1.2.1/libexec/../logs/hadoop-wukong-secondarynamenode-bd01.out
starting jobtracker, logging to /home/wukong/a_usr/hadoop-1.2.1/libexec/../logs/hadoop-wukong-jobtracker-bd01.out
bd02: starting tasktracker, logging to /home/wukong/a_usr/hadoop-1.2.1/libexec/../logs/hadoop-wukong-tasktracker-bd02.out
[wukong@bd01 bin]$ jps
1440 Jps
1132 NameNode
1280 SecondaryNameNode
1364 JobTracker
3. Copy the new directory to HDFS
[wukong@bd01 ~]$ a_usr/hadoop-1.2.1/bin/hadoop fs -put ./test test_in
[wukong@bd01 ~]$ a_usr/hadoop-1.2.1/bin/hadoop fs -ls ./test_in
Found 2 items
-rw-r--r--   1 wukong supergroup         12 2014-07-31 15:38 /user/wukong/test_in/text1
-rw-r--r--   1 wukong supergroup         13 2014-07-31 15:38 /user/wukong/test_in/text2
[wukong@bd01 ~]$ a_usr/hadoop-1.2.1/bin/hadoop fs -ls
Found 1 items
drwxr-xr-x   - wukong supergroup          0 2014-07-31 15:38 /user/wukong/test_in
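The listing reports 12 and 13 bytes for text1 and text2. Those sizes are the text that echo wrote plus one trailing newline each. The following is a small Python check, not part of the original session; it assumes the files contain exactly the echo output and nothing else.

```python
# Contents that `echo "hello world"` and `echo "hello hadoop"` write:
# the quoted text followed by a single newline (assumed, not read from HDFS).
files = {
    "text1": "hello world\n",
    "text2": "hello hadoop\n",
}

# Size in bytes of each file; the text is plain ASCII, so one byte per character.
sizes = {name: len(text.encode("utf-8")) for name, text in files.items()}

for name, size in sizes.items():
    print(name, size)
```

If the sizes shown by fs -ls differ from these, the upload probably picked up different file contents than expected.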
4. Run the wordcount program
[wukong@bd01 hadoop-1.2.1]$ bin/hadoop jar hadoop-examples-1.2.1.jar wordcount test_in test_out
14/07/31 15:43:44 INFO input.FileInputFormat: Total input paths to process : 2
14/07/31 15:43:44 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/07/31 15:43:44 WARN snappy.LoadSnappy: Snappy native library not loaded
14/07/31 15:43:46 INFO mapred.JobClient: Running job: job_201407311530_0001
14/07/31 15:43:47 INFO mapred.JobClient:  map 0% reduce 0%
14/07/31 15:44:11 INFO mapred.JobClient:  map 100% reduce 0%
14/07/31 15:44:27 INFO mapred.JobClient:  map 100% reduce 100%
14/07/31 15:44:29 INFO mapred.JobClient: Job complete: job_201407311530_0001
14/07/31 15:44:29 INFO mapred.JobClient: Counters: 29
14/07/31 15:44:29 INFO mapred.JobClient:   Job Counters
14/07/31 15:44:29 INFO mapred.JobClient:     Launched reduce tasks=1
14/07/31 15:44:29 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=43406
14/07/31 15:44:29 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/07/31 15:44:29 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/07/31 15:44:29 INFO mapred.JobClient:     Launched map tasks=2
14/07/31 15:44:29 INFO mapred.JobClient:     Data-local map tasks=2
14/07/31 15:44:29 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=14688
14/07/31 15:44:29 INFO mapred.JobClient:   File Output Format Counters
14/07/31 15:44:29 INFO mapred.JobClient:     Bytes Written=25
14/07/31 15:44:29 INFO mapred.JobClient:   FileSystemCounters
14/07/31 15:44:29 INFO mapred.JobClient:     FILE_BYTES_READ=55
14/07/31 15:44:29 INFO mapred.JobClient:     HDFS_BYTES_READ=239
14/07/31 15:44:29 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=176694
14/07/31 15:44:29 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=25
14/07/31 15:44:29 INFO mapred.JobClient:   File Input Format Counters
14/07/31 15:44:29 INFO mapred.JobClient:     Bytes Read=25
14/07/31 15:44:29 INFO mapred.JobClient:   Map-Reduce Framework
14/07/31 15:44:29 INFO mapred.JobClient:     Map output materialized bytes=61
14/07/31 15:44:29 INFO mapred.JobClient:     Map input records=2
14/07/31 15:44:29 INFO mapred.JobClient:     Reduce shuffle bytes=61
14/07/31 15:44:29 INFO mapred.JobClient:     Spilled Records=8
14/07/31 15:44:29 INFO mapred.JobClient:     Map output bytes=41
14/07/31 15:44:29 INFO mapred.JobClient:     Total committed heap usage (bytes)=417439744
14/07/31 15:44:29 INFO mapred.JobClient:     CPU time spent (ms)=2880
14/07/31 15:44:29 INFO mapred.JobClient:     Combine input records=4
14/07/31 15:44:29 INFO mapred.JobClient:     SPLIT_RAW_BYTES=214
14/07/31 15:44:29 INFO mapred.JobClient:     Reduce input records=4
14/07/31 15:44:29 INFO mapred.JobClient:     Reduce input groups=3
14/07/31 15:44:29 INFO mapred.JobClient:     Combine output records=4
14/07/31 15:44:29 INFO mapred.JobClient:     Physical memory (bytes) snapshot=418050048
14/07/31 15:44:29 INFO mapred.JobClient:     Reduce output records=3
14/07/31 15:44:29 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2174017536
14/07/31 15:44:29 INFO mapred.JobClient:     Map output records=4
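The record counters in the job log can be reproduced by hand. The sketch below is a local Python simulation of the map, combine, and reduce phases, not Hadoop code. It assumes that each input file is handled by its own map task (the log shows 2 map tasks), that words are split on whitespace, and that the combiner sums counts the same way the reducer does. The function names are made up for illustration.

```python
from collections import Counter

# One entry per input file; each file is treated as one map task (assumption
# based on "Launched map tasks=2" in the log).
splits = {
    "text1": ["hello world"],
    "text2": ["hello hadoop"],
}

def map_line(line):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    return [(word, 1) for word in line.split()]

def sum_by_key(pairs):
    """Sum the values for each key; used here as both combiner and reducer."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())

counters = Counter()
shuffled = []
for name, lines in splits.items():
    counters["Map input records"] += len(lines)
    map_output = [pair for line in lines for pair in map_line(line)]
    counters["Map output records"] += len(map_output)
    # The combiner runs on each map task's output separately, so the two
    # "hello" pairs (one per file) are not merged at this stage.
    combined = sum_by_key(map_output)
    counters["Combine input records"] += len(map_output)
    counters["Combine output records"] += len(combined)
    shuffled.extend(combined)

# The single reducer receives all pairs sorted by key, then sums each group.
shuffled.sort()
counters["Reduce input records"] = len(shuffled)
counters["Reduce input groups"] = len({word for word, _ in shuffled})
result = sum_by_key(shuffled)
counters["Reduce output records"] = len(result)

for name, value in counters.items():
    print(f"{name}={value}")
print(result)
```

Under these assumptions the combiner cannot shrink anything, because no word repeats within a single file. That is why the log reports the same value for combine input and combine output records.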
Once the job finishes, check the output directory:
[wukong@bd01 hadoop-1.2.1]$ bin/hadoop fs -ls
Found 2 items
drwxr-xr-x   - wukong supergroup          0 2014-07-31 15:38 /user/wukong/test_in
drwxr-xr-x   - wukong supergroup          0 2014-07-31 15:44 /user/wukong/test_out
[wukong@bd01 hadoop-1.2.1]$ a_usr/hadoop-1.2.1/bin/hadoop fs -ls ./test_out
-bash: a_usr/hadoop-1.2.1/bin/hadoop: No such file or directory
[wukong@bd01 hadoop-1.2.1]$ bin/hadoop fs -ls ./test_out
Found 3 items
-rw-r--r--   1 wukong supergroup          0 2014-07-31 15:44 /user/wukong/test_out/_SUCCESS
drwxr-xr-x   - wukong supergroup          0 2014-07-31 15:43 /user/wukong/test_out/_logs
-rw-r--r--   1 wukong supergroup         25 2014-07-31 15:44 /user/wukong/test_out/part-r-00000
5. The final result is in part-r-00000
[wukong@bd01 hadoop-1.2.1]$ bin/hadoop fs -cat ./test_out/part-r-00000
hadoop 1
hello 2
world 1
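The output file holds one word per line followed by its count, sorted by word. The Python sketch below rebuilds that content locally and parses it back. It assumes the default text output layout of key, tab, value, newline. With that layout the file comes to 25 bytes, which agrees with the size that fs -ls reports for part-r-00000.

```python
# Word counts as printed by `fs -cat` on part-r-00000.
result = [("hadoop", 1), ("hello", 2), ("world", 1)]

# Assumed layout: key, a tab, the value, then a newline, one record per line.
part_file = "".join(f"{word}\t{count}\n" for word, count in result)
size = len(part_file.encode("utf-8"))
print(size)

# Parsing the file back into a dictionary for further processing.
parsed = {}
for line in part_file.splitlines():
    word, count = line.split("\t")
    parsed[word] = int(count)
print(parsed)
```

Splitting on the tab rather than on any whitespace keeps the parser correct even if a key itself contains spaces, which cannot happen in this wordcount run but can in other jobs.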
That completes the wordcount run on Hadoop 1.2.1. Try the steps on your own cluster to see them in practice.