Welcome to Lu Chunli's work notes. Learning is a kind of faith; let time test the power of persistence.
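OS: Windows 7, 64-bit
Eclipse (Java EE edition): Luna Release (4.4.0)
Hadoop: 2.6.0
Hadoop plugin: hadoop-eclipse-plugin-2.2.0.jar
0. Preface
The earlier work note "Hadoop 2.6 cluster setup" already built the Hadoop cluster environment. Normally a MapReduce job has to be packaged as a jar and submitted to the Hadoop cluster, but to make testing easier, this note prepares an Eclipse environment that can run MapReduce jobs directly.
1. Plugin installation
Copy hadoop-eclipse-plugin-2.2.0.jar into the plugins directory under the Eclipse installation directory.
2. Environment configuration
Restart Eclipse once hadoop-eclipse-plugin-2.2.0.jar is in the plugins directory.
a.) Locate the MapReduce plugin
b.) Create a new Hadoop location
c.) Configure the General tab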
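Parameters:
Location name: a name of your choice.
Map/Reduce(V2) Master: the cluster's JobTracker settings; must match mapreduce.jobtracker.address in mapred-site.xml.
DFS Master: must match fs.defaultFS in core-site.xml. Point it at the Active NameNode; if it is set to cluster, the plugin tries to resolve "cluster" as a hostname, and the resolution fails.
User name: the user I use on the Hadoop cluster, hadoop.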
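To make the DFS Master note concrete, here is a minimal sketch of how an fs.defaultFS value breaks down into the Host and Port fields. The class DfsMasterFields and both sample URIs are hypothetical, made up for illustration; it only shows why a value such as hdfs://cluster carries no port and is most likely a logical nameservice name rather than a resolvable host.

```java
import java.net.URI;

// Hypothetical helper, not part of the plugin: splits an fs.defaultFS value
// into the Host and Port fields of the "DFS Master" section.
final class DfsMasterFields {
    final String host;
    final int port; // -1 when the URI carries no port

    private DfsMasterFields(String host, int port) {
        this.host = host;
        this.port = port;
    }

    static DfsMasterFields fromDefaultFs(String defaultFs) {
        URI uri = URI.create(defaultFs);
        return new DfsMasterFields(uri.getHost(), uri.getPort());
    }

    // A value without a port is most likely a nameservice ID, not a real host.
    boolean looksLikeNameservice() {
        return port == -1;
    }

    public static void main(String[] args) {
        // Both values are made-up examples, not taken from the cluster above.
        for (String value : new String[] {"hdfs://nnode:8020", "hdfs://cluster"}) {
            DfsMasterFields f = fromDefaultFs(value);
            System.out.println(value + " -> host=" + f.host + ", port=" + f.port
                    + ", nameservice=" + f.looksLikeNameservice());
        }
    }
}
```

When the check reports a nameservice, enter the hostname and RPC port of the currently Active NameNode in the dialog instead.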
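Note:
I am not sure what many of the settings under Advanced Parameters do, so I left them unchanged.
d.) Verify the configuration
The directories on HDFS are now visible.
3. Running wordcount
The Hadoop plugin is now integrated into Eclipse, so the next step is to run wordcount, the introductory MapReduce program.
a.) Create a new MapReduce Project
First unpack the Hadoop distribution on the local machine, so that the jars Hadoop depends on are added automatically when the MapReduce project is created.
b.) Prepare the program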
package com.invic.mapreduce.wordcount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MyMapper extends Mapper
package com.invic.mapreduce.wordcount;

import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

/**
 * Sums the counts emitted for each word. Input and output types are the
 * same, so the class can also serve as the combiner.
 *
 * @author lucl
 */
public class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private static final Log LOG = LogFactory.getLog(MyReducer.class);

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        LOG.info("=====================reducer================");
        LOG.info("key " + key + "\tvalue : " + values);
        int result = 0;
        for (IntWritable val : values) {
            LOG.info("\t\t : " + val.get());
            result += val.get();
        }
        // The original string used "\result", which Java reads as a carriage return
        LOG.info("total key : " + key + "\tresult : " + result);
        context.write(key, new IntWritable(result));
    }
}
package com.invic.mapreduce.wordcount;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

/**
 * Driver for the wordcount job.
 *
 * @author lucl
 */
public class WordCounterTool extends Configured implements Tool {
    private static final Log LOG = LogFactory.getLog(WordCounterTool.class);

    public static void main(String[] args) {
        // This system property must be set here; otherwise a winutils.exe error is reported
        System.setProperty("hadoop.home.dir", "E:\\hadoop-2.6.0\\hadoop-2.6.0");
        try {
            int exit = ToolRunner.run(new WordCounterTool(), args);
            LOG.info("result : " + exit);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        // ToolRunner has already applied the generic options to getConf()
        // and passes only the remaining arguments to run()
        Configuration conf = getConf();
        if (args.length < 2) {
            LOG.info("Usage: wordcount <in> [<in>...] <out>");
            return 2;
        }

        // Pass the configuration to the job; Job.getInstance() without arguments ignores it
        Job job = Job.getInstance(conf, "wordcount");
        job.setJarByClass(WordCounterTool.class);
        job.setMapperClass(MyMapper.class);
        job.setCombinerClass(MyReducer.class);
        job.setReducerClass(MyReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Every argument except the last is an input path; the last is the output path
        for (int i = 0; i < args.length - 1; ++i) {
            FileInputFormat.addInputPath(job, new Path(args[i]));
        }
        FileOutputFormat.setOutputPath(job, new Path(args[args.length - 1]));

        return job.waitForCompletion(true) ? 0 : 1;
    }
}
c.) Run the MapReduce program
Right-click WordCounterTool, open Run Configurations, set the input arguments, and click the "Run" button.
file1.txt in the data directory contains:
hello world hello markhuang hello hadoop
file2.txt in the data directory contains:
hadoop ok hadoop fail hadoop 2.3
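As a cross-check for the job's output, the counts for these two files can be worked out without Hadoop. The sketch below is a plain-Java stand-in, not the author's job: the class LocalWordCount is hypothetical, and it assumes the mapper splits lines on whitespace, as the StringTokenizer import in MyMapper suggests.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java stand-in for the wordcount job; no Hadoop classes are involved.
final class LocalWordCount {
    static Map<String, Integer> count(String... lines) {
        // TreeMap keeps the words sorted, similar to the sorted keys a reducer sees
        Map<String, Integer> totals = new TreeMap<>();
        for (String line : lines) {
            // "Map" step: split on whitespace and emit each word with a count of 1
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                // "Reduce" step: add the 1 to the running total for that word
                totals.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Integer> totals = count(
                "hello world hello markhuang hello hadoop",
                "hadoop ok hadoop fail hadoop 2.3");
        totals.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

For the sample data this gives hadoop 4 and hello 3, with every other word counted once, which is what the job's output file should show if the run succeeds.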
d.) The program fails
15/07/19 22:17:31 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/tmp/hadoop-Administrator/mapred/staging/Administrator907501946/.staging/job_local907501946_0001
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:557)
    at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
    at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:187)
    at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
    at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:108)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
    at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:131)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
    at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:536)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Unknown Source)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
    at com.invic.mapreduce.wordcount.WordCounterTool.run(WordCounterTool.java:60)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at com.invic.mapreduce.wordcount.WordCounterTool.main(WordCounterTool.java:31)
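Note:
Download the hadoop.dll built for Hadoop 2.6 and place it in C:\Windows\System32.
e.) Run again
Right-click WordCounterTool and choose Run As --> Run on Hadoop; after a short wait the program completes successfully.
f.) View the output
Summary: the plugin is configured successfully.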
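Before rerunning, it can help to confirm that both native pieces are where the JVM will look. The following is a rough pre-flight sketch, not part of Hadoop: WindowsNativeCheck and its method names are hypothetical. It assumes winutils.exe lives under the bin directory of hadoop.home.dir and that hadoop.dll is loaded from a directory on java.library.path, which on Windows normally includes the PATH entries such as C:\Windows\System32.

```java
import java.io.File;

// Hypothetical pre-flight check; the class and method names are illustrative.
final class WindowsNativeCheck {
    // True if <hadoopHome>/bin/winutils.exe exists.
    static boolean hasWinutils(File hadoopHome) {
        return new File(new File(hadoopHome, "bin"), "winutils.exe").isFile();
    }

    // Returns the first hadoop.dll found on the given search path, or null.
    static File findHadoopDll(String searchPath) {
        if (searchPath == null) {
            return null;
        }
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File dll = new File(dir, "hadoop.dll");
            if (dll.isFile()) {
                return dll;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Falls back to the path used in WordCounterTool when the property is unset
        File home = new File(
                System.getProperty("hadoop.home.dir", "E:\\hadoop-2.6.0\\hadoop-2.6.0"));
        System.out.println("winutils.exe present: " + hasWinutils(home));
        System.out.println("hadoop.dll found at: "
                + findHadoopDll(System.getProperty("java.library.path")));
    }
}
```

The check only confirms that the files exist; a hadoop.dll built for a different Hadoop version or for a 32-bit JVM can still raise the same UnsatisfiedLinkError.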