How to Implement K-Means Clustering in MapReduce

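How do you implement K-Means clustering in MapReduce? This article works through the question in three parts: the mapper, which assigns each record to its nearest center; the reducer, which computes the new centers; and the utility class that both rely on.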


First, the map phase:

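// Requires: java.io.IOException, java.util.ArrayList, org.apache.hadoop.io.IntWritable, org.apache.hadoop.io.LongWritable,
// org.apache.hadoop.io.Text, org.apache.hadoop.mapreduce.Mapper (imported at the top of the enclosing job class file).
public static class KMmap extends Mapper<LongWritable, Text, IntWritable, Text> {
        // The set of cluster centers. The initial centers are chosen by the user;
        // "centersPath" is the HDFS path where that set is stored.
        ArrayList<ArrayList<Double>> centers = null;
        // Number of centers (k)
        int k = 0;
        // Load the centers once per map task
        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            // getCentersFromHDFS takes a path and returns an ArrayList<ArrayList<Double>>
            centers = Utils.getCentersFromHDFS(context.getConfiguration().get("centersPath"), false);
            k = centers.size();
        }
        /**
         * 1. Each call reads one record to be classified, compares it with every center, and assigns it to the nearest one.
         * 2. Emits the center ID as key and the record as value (e.g. "1 0.2": 1 is the center ID, 0.2 is a value close to that center).
         */
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // textToArray splits one input line into fields and converts them to a list of doubles.
            // The original comment names "," as the delimiter while the Utils listing splits on "\t";
            // whichever is used must match the input data.
            ArrayList<Double> fileds = Utils.textToArray(value);
            int sizeOfFileds = fileds.size();
            double minDistance = Double.MAX_VALUE;
            int centerIndex = 0;
            // Compare the current record with each of the k centers in turn.
            // NOTE: the original listing is truncated from this loop onward. The rest of this method is a
            // reconstruction that assumes squared Euclidean distance.
            for (int i = 0; i < k; i++) {
                double currentDistance = 0;
                for (int j = 0; j < sizeOfFileds; j++) {
                    double diff = centers.get(i).get(j) - fileds.get(j);
                    currentDistance += diff * diff;
                }
                if (currentDistance < minDistance) {
                    minDistance = currentDistance;
                    centerIndex = i;
                }
            }
            // Assumption: center IDs are 1-based, matching the "1 0.2" example in the comment above.
            context.write(new IntWritable(centerIndex + 1), value);
        }
}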
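The assignment step that the mapper performs can also be tried outside Hadoop. The following is a minimal sketch in plain Java, assuming squared Euclidean distance (the article's own distance formula does not survive in the truncated listing). The class name `NearestCenterDemo` and the method `nearestCenter` are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone illustration of the mapper's assignment step.
class NearestCenterDemo {

    // Returns the 0-based index of the center closest to `point`.
    // Assumption: squared Euclidean distance; all vectors have the same length.
    static int nearestCenter(List<double[]> centers, double[] point) {
        double minDistance = Double.MAX_VALUE;
        int centerIndex = 0;
        for (int i = 0; i < centers.size(); i++) {
            double[] center = centers.get(i);
            double distance = 0;
            for (int j = 0; j < point.length; j++) {
                double diff = center[j] - point[j];
                distance += diff * diff;
            }
            if (distance < minDistance) {
                minDistance = distance;
                centerIndex = i;
            }
        }
        return centerIndex;
    }

    public static void main(String[] args) {
        List<double[]> centers = Arrays.asList(
                new double[] {0.0, 0.0},
                new double[] {10.0, 10.0});
        // (9, 8) lies closer to the second center, so this prints 1.
        System.out.println(nearestCenter(centers, new double[] {9.0, 8.0}));
    }
}
```

Ties go to the lower index because the comparison is strict, which keeps the assignment deterministic across runs.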
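Next is the reduce phase:

    // Additionally requires: org.apache.hadoop.io.NullWritable, org.apache.hadoop.mapreduce.Reducer.
    // Rely on the reducer's grouping by key to bring together all records assigned to the same center.
    // Assumption: the output types (NullWritable, Text) are not visible in the original listing.
    public static class KMreduce extends Reducer<IntWritable, Text, NullWritable, Text> {

        /**
         * 1. The key is the cluster center ID; the values are the records assigned to that center.
         * 2. Compute the mean of each field over all records to obtain the new center.
         */
        @Override
        protected void reduce(IntWritable key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            ArrayList<ArrayList<Double>> filedsList = new ArrayList<ArrayList<Double>>();
            // Read the records one by one; each line becomes an ArrayList<Double>
            for (Text value : values) {
                filedsList.add(Utils.textToArray(value));
            }
            // Compute the new center.
            // Number of fields per record
            int filedSize = filedsList.get(0).size();
            double[] avg = new double[filedSize];
            // NOTE: the original listing is truncated from this loop onward; the rest is a reconstruction.
            for (int i = 0; i < filedSize; i++) {
                double sum = 0;
                for (ArrayList<Double> record : filedsList) {
                    sum += record.get(i);
                }
                avg[i] = sum / filedsList.size();
            }
            // Assumption: write the new center as one tab-separated line so that Utils.textToArray can read it back.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < filedSize; i++) {
                if (i > 0) sb.append('\t');
                sb.append(avg[i]);
            }
            context.write(NullWritable.get(), new Text(sb.toString()));
        }
    }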
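The averaging done in the reducer is simply a component-wise mean. As a minimal sketch in plain Java (the class name `CentroidDemo` and method `mean` are hypothetical, and the records are assumed to be non-empty and of equal length):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone illustration of the reducer's update step.
class CentroidDemo {

    // Component-wise mean of all records assigned to one center.
    // Assumption: `records` is non-empty and every record has the same length.
    static double[] mean(List<double[]> records) {
        int dims = records.get(0).length;
        double[] avg = new double[dims];
        for (double[] record : records) {
            for (int j = 0; j < dims; j++) {
                avg[j] += record[j];
            }
        }
        for (int j = 0; j < dims; j++) {
            avg[j] /= records.size();
        }
        return avg;
    }

    public static void main(String[] args) {
        List<double[]> records = Arrays.asList(
                new double[] {1.0, 2.0},
                new double[] {3.0, 4.0},
                new double[] {5.0, 6.0});
        // Prints [3.0, 4.0]
        System.out.println(Arrays.toString(mean(records)));
    }
}
```

A reducer always receives at least one value per key, so the non-empty assumption holds in the MapReduce setting. A center that attracts no records produces no reducer call at all, and a real job would need to decide how to handle that case.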
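Finally, here is the util class used above. It mainly provides methods for reading files and manipulating strings.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.util.LineReader;

public class Utils {

    // Read the centers from a single file, or from every file in a directory when isDirectory is true
    public static ArrayList<ArrayList<Double>> getCentersFromHDFS(String centersPath, boolean isDirectory)
            throws IOException {
        ArrayList<ArrayList<Double>> result = new ArrayList<ArrayList<Double>>();
        Path path = new Path(centersPath);
        Configuration conf = new Configuration();

        FileSystem fileSystem = path.getFileSystem(conf);

        if (isDirectory) {
            FileStatus[] listFile = fileSystem.listStatus(path);
            for (int i = 0; i < listFile.length; i++) {
                result.addAll(getCentersFromHDFS(listFile[i].getPath().toString(), false));
            }
            return result;
        }
        FSDataInputStream fsis = fileSystem.open(path);
        LineReader lineReader = new LineReader(fsis, conf);
        try {
            Text line = new Text();
            while (lineReader.readLine(line) > 0) {
                // Skip blank lines, which would otherwise fail to parse as numbers
                if (line.getLength() == 0) {
                    continue;
                }
                result.add(textToArray(line));
            }
        } finally {
            // Closing the reader also closes the underlying stream
            lineReader.close();
        }
        return result;
    }

    // Delete a file or directory (recursively)
    public static void deletePath(String pathStr) throws IOException {
        Configuration conf = new Configuration();
        Path path = new Path(pathStr);
        FileSystem hdfs = path.getFileSystem(conf);
        hdfs.delete(path, true);
    }

    // Split a line on tabs and parse each field as a double
    public static ArrayList<Double> textToArray(Text text) {
        ArrayList<Double> list = new ArrayList<Double>();
        String[] fileds = text.toString().split("\t");
        // NOTE: the original listing is truncated inside this loop; the loop body and return are reconstructed.
        for (int i = 0; i < fileds.length; i++) {
            list.add(Double.parseDouble(fileds[i]));
        }
        return list;
    }
}

The original listing then breaks off and resumes inside a method that compares the old centers with the new ones. Its signature is missing, as is everything after the start of its loop; only the following fragment survives (centerPath and newPath are presumably its parameters):

                  List<ArrayList<Double>> oldCenters = Utils.getCentersFromHDFS(centerPath, false);
                  List<ArrayList<Double>> newCenters = Utils.getCentersFromHDFS(newPath, true);

                  int size = oldCenters.size();
                  int fildSize = oldCenters.get(0).size();
                  double distance = 0;
                  for (int i = 0; i < size; i++) {
                      // ... (the loop body and the remainder of the method are missing from the original)
                  }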
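Because the comparison routine is cut off in the original, the following is only a sketch of one way such a convergence check could look, not the author's method. It assumes that the old and new centers are listed in the same order, that movement is measured as the sum of squared differences, and that iteration stops once the total falls to or below a caller-chosen tolerance. The names `ConvergenceDemo`, `totalShift`, and `converged` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone illustration of a center-comparison step.
class ConvergenceDemo {

    // Sum of squared differences between corresponding old and new centers.
    // Assumption: both lists have the same size, order, and dimensionality.
    static double totalShift(List<double[]> oldCenters, List<double[]> newCenters) {
        double distance = 0;
        for (int i = 0; i < oldCenters.size(); i++) {
            double[] oldCenter = oldCenters.get(i);
            double[] newCenter = newCenters.get(i);
            for (int j = 0; j < oldCenter.length; j++) {
                double diff = oldCenter[j] - newCenter[j];
                distance += diff * diff;
            }
        }
        return distance;
    }

    // True when the centers moved by no more than `tolerance` in total.
    static boolean converged(List<double[]> oldCenters, List<double[]> newCenters, double tolerance) {
        return totalShift(oldCenters, newCenters) <= tolerance;
    }

    public static void main(String[] args) {
        List<double[]> oldCenters = Arrays.asList(new double[] {0.0, 0.0}, new double[] {10.0, 10.0});
        List<double[]> moved = Arrays.asList(new double[] {1.0, 0.0}, new double[] {10.0, 10.0});
        // The first center moved by 1 along one axis, so the total shift is 1.0.
        System.out.println(totalShift(oldCenters, moved));
        System.out.println(converged(oldCenters, moved, 0.5));      // false
        System.out.println(converged(oldCenters, oldCenters, 0.0)); // true
    }
}
```

In a driver built on this idea, each MapReduce pass would write new centers to a fresh output path, the check would compare them with the current centers, and the job would be resubmitted with the new centers until the check passes or an iteration limit is reached.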
            
                        

