This article explains how to use the classic WordCount MapReduce example on Hadoop 2.x. It walks through the complete source code — mapper, reducer, and driver — and should be a useful reference for anyone getting started with MapReduce.
package com.jhl.haoop.examples;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
public class WordCount {
    // Mapper section
    public static class TokenizerMapper extends
            Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1); // every occurrence of a word counts as 1
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // StringTokenizer's default delimiters are space, \t, \n, \r and \f:
            // public StringTokenizer(String str) {
            //     this(str, " \t\n\r\f", false);
            // }
            StringTokenizer itr = new StringTokenizer(value.toString()); // value.toString() is the text of one input line
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken()); // the token becomes the map output key
                context.write(word, one);  // emit the (word, 1) pair
            }
        }
    }
    // Reducer section
    public static class IntSumReducer extends
            Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) { // iterate over every count emitted for this word
                sum += val.get();            // accumulate
            }
            result.set(sum);                 // total number of occurrences
            context.write(key, result);
        }
    }
    // Driver (client) section
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // load the Hadoop configuration
        // GenericOptionsParser parses the standard Hadoop command-line options
        // and sets the corresponding values on the Configuration object.
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        // Job.getInstance replaces the Job constructor deprecated in Hadoop 2.x
        Job job = Job.getInstance(conf, "WordCount"); // create the job and set its name
        job.setJarByClass(WordCount.class);           // the class whose jar is shipped to the cluster
        job.setMapperClass(TokenizerMapper.class);    // set the Mapper class
        // IntSumReducer can double as the combiner because summing is associative:
        // map-side partial sums yield the same final totals.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);     // set the Reducer class
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));   // input path
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1])); // output path (must not already exist)
        job.setOutputKeyClass(Text.class);            // key type of the final output
        job.setOutputValueClass(IntWritable.class);   // value type of the final output
        boolean isSuccess = job.waitForCompletion(true); // submit the job, wait for it to finish,
                                                         // printing progress on the client
        System.exit(isSuccess ? 0 : 1);
    }
}
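To see why this produces word counts, here is a minimal plain-Java sketch (no Hadoop required; the input line is made up for illustration) of what the framework does end to end: the mapper emits a (word, 1) pair per token, the shuffle groups the pairs by word, and the reducer sums each group.

import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCountDemo {
    public static void main(String[] args) {
        String line = "hello hadoop hello world";      // stand-in for one input line
        Map<String, Integer> counts = new TreeMap<>(); // stands in for shuffle + reduce

        StringTokenizer itr = new StringTokenizer(line); // same tokenization as the mapper
        while (itr.hasMoreTokens()) {
            // "map" emits (word, 1); merge plays the reducer, summing per key
            counts.merge(itr.nextToken(), 1, Integer::sum);
        }
        counts.forEach((w, n) -> System.out.println(w + "\t" + n));
        // prints: hadoop 1, hello 2, world 1 (tab-separated, like the job's output)
    }
}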
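To actually run the job, compile the class against the Hadoop libraries, package it into a jar, and submit it with the hadoop command. The jar name and the HDFS paths below are placeholders, so adjust them to your environment; also note that the output directory must not exist before the job starts, or FileOutputFormat will fail the job.

javac -classpath `hadoop classpath` -d classes WordCount.java
jar cf wordcount.jar -C classes .
hadoop jar wordcount.jar com.jhl.haoop.examples.WordCount /user/hadoop/input /user/hadoop/output
hdfs dfs -cat /user/hadoop/output/part-r-00000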
That is how the WordCount MapReduce example works on Hadoop 2.x. Thanks for reading; hopefully the walkthrough above is a useful starting point for writing your own MapReduce jobs.