本篇文章給大家分享的是有關(guān)Java中怎么實(shí)現(xiàn)一個(gè)雙數(shù)組Trie樹(shù),小編覺(jué)得挺實(shí)用的,因此分享給大家學(xué)習(xí),希望大家閱讀完這篇文章后可以有所收獲,話不多說(shuō),跟著小編一起來(lái)看看吧。
從事綿陽(yáng)服務(wù)器托管,服務(wù)器租用,云主機(jī),網(wǎng)站空間,域名申請(qǐng),CDN,網(wǎng)絡(luò)代維等服務(wù)。
1.對(duì)于插入字符串,如果有一個(gè)字符串是另一個(gè)字符串的子串的話,我是將結(jié)束符也作為一條邊,產(chǎn)生一個(gè)新的結(jié)點(diǎn),這個(gè)結(jié)點(diǎn)新節(jié)點(diǎn)的Base我置為0
所以一個(gè)字符串結(jié)束也有2中情況:一個(gè)是Base值為負(fù),存儲(chǔ)剩余字符(可能只有一個(gè)結(jié)束符)到Tail數(shù)組;另一個(gè)是Base為0。
所以在查詢的時(shí)候要考慮一下這兩種情況
2.對(duì)于***種沖突(論文中的Case 3),可能要將Tail中的字符串取出一部分,作為邊放到索引中。論文是使用將尾串左移的方式,我的方式直接修改Base值,而不是移動(dòng)尾串。
下面是java實(shí)現(xiàn)的代碼,可以處理相同字符串插入,子串的插入等情況
/* * Name: Double Array Trie * Author: Yaguang Ding * Mail: dingyaguang117@gmail.com * Blog: blog.csdn.net/dingyaguang117 * Date: 2012/5/21 * Note: a word ends may be either of these two case: * 1. Base[cur_p] == pos ( pos<0 and Tail[-pos] == 'END_CHAR' ) * 2. Check[Base[cur_p] + Code('END_CHAR')] == cur_p */ import java.util.ArrayList; import java.util.HashMap; import java.util.Map; import java.util.Arrays; public class DoubleArrayTrie { final char END_CHAR = '\0'; final int DEFAULT_LEN = 1024; int Base[] = new int [DEFAULT_LEN]; int Check[] = new int [DEFAULT_LEN]; char Tail[] = new char [DEFAULT_LEN]; int Pos = 1; MapCharMap = new HashMap (); ArrayList CharList = new ArrayList (); public DoubleArrayTrie() { Base[1] = 1; CharMap.put(END_CHAR,1); CharList.add(END_CHAR); CharList.add(END_CHAR); for(int i=0;i<26;++i) { CharMap.put((char)('a'+i),CharMap.size()+1); CharList.add((char)('a'+i)); } } private void Extend_Array() { Base = Arrays.copyOf(Base, Base.length*2); Check = Arrays.copyOf(Check, Check.length*2); } private void Extend_Tail() { Tail = Arrays.copyOf(Tail, Tail.length*2); } private int GetCharCode(char c) { if (!CharMap.containsKey(c)) { CharMap.put(c,CharMap.size()+1); CharList.add(c); } return CharMap.get(c); } private int CopyToTailArray(String s,int p) { int _Pos = Pos; while(s.length()-p+1 > Tail.length-Pos) { Extend_Tail(); } for(int i=p; i = Base.length) Extend_Array(); if(Base[cur_p]!= 0 || Check[cur_p]!= 0) { flag = false; break; } } if (flag) return i; } } private ArrayList GetChildList(int p) { ArrayList ret = new ArrayList (); for(int i=1; i<=CharMap.size();++i) { if(Base[p]+i >= Check.length) break; if(Check[Base[p]+i] == p) { ret.add(i); } } return ret; } private boolean TailContainString(int start,String s2) { for(int i=0;i = Base.length) Extend_Array(); //空閑狀態(tài) if(Base[cur_p] == 0 && Check[cur_p] == 0) { Base[cur_p] = -Pos; Check[cur_p] = pre_p; Pos = CopyToTailArray(s,i+1); break; }else //已存在狀態(tài) if(Base[cur_p] > 0 && Check[cur_p] == pre_p) { pre_p = cur_p; continue; }else //沖突 1:遇到 Base[cur_p]小于0的,即遇到一個(gè)被壓縮存到Tail中的字符串 if(Base[cur_p] < 0 && Check[cur_p] == pre_p) { int head = -Base[cur_p]; if(s.charAt(i+1)== END_CHAR && Tail[head]==END_CHAR) //插入重復(fù)字符串 { break; } //公共字母的情況,因?yàn)樯弦粋€(gè)判斷已經(jīng)排除了結(jié)束符,所以一定是2個(gè)都不是結(jié)束符 if (Tail[head] == s.charAt(i+1)) { int avail_base = x_check(new Integer[]{GetCharCode(s.charAt(i+1))}); Base[cur_p] = avail_base; Check[avail_base+GetCharCode(s.charAt(i+1))] = cur_p; Base[avail_base+GetCharCode(s.charAt(i+1))] = -(head+1); pre_p = cur_p; continue; } else { //2個(gè)字母不相同的情況,可能有一個(gè)為結(jié)束符 int avail_base ; avail_base = x_check(new Integer[]{GetCharCode(s.charAt(i+1)),GetCharCode(Tail[head])}); Base[cur_p] = avail_base; Check[avail_base+GetCharCode(Tail[head])] = cur_p; Check[avail_base+GetCharCode(s.charAt(i+1))] = cur_p; //Tail 為END_FLAG 的情況 if(Tail[head] == END_CHAR) Base[avail_base+GetCharCode(Tail[head])] = 0; else Base[avail_base+GetCharCode(Tail[head])] = -(head+1); if(s.charAt(i+1) == END_CHAR) Base[avail_base+GetCharCode(s.charAt(i+1))] = 0; else Base[avail_base+GetCharCode(s.charAt(i+1))] = -Pos; Pos = CopyToTailArray(s,i+2); break; } }else //沖突2:當(dāng)前結(jié)點(diǎn)已經(jīng)被占用,需要調(diào)整pre的base if(Check[cur_p] != pre_p) { ArrayList list1 = GetChildList(pre_p); int toBeAdjust; ArrayList list = null; if(true) { toBeAdjust = pre_p; list = list1; } int origin_base = Base[toBeAdjust]; list.add(GetCharCode(s.charAt(i))); int avail_base = x_check((Integer[])list.toArray(new Integer[list.size()])); list.remove(list.size()-1); Base[toBeAdjust] = avail_base; for(int j=0; j 0) { ArrayList subsequence = GetChildList(tmp1); for(int k=0; k GetAllChildWord(int index) { ArrayList result = new ArrayList (); if(Base[index] == 0) { result.add(""); return result; } if(Base[index] < 0) { String r=""; for(int i=-Base[index];Tail[i]!=END_CHAR;++i) { r+= Tail[i]; } result.add(r); return result; } for(int i=1;i<=CharMap.size();++i) { if(Check[Base[index]+i] == index) { for(String s:GetAllChildWord(Base[index]+i)) { result.add(CharList.get(i)+s); } //result.addAll(GetAllChildWord(Base[index]+i)); } } return result; } public ArrayList FindAllWords(String word) { ArrayList result = new ArrayList (); String prefix = ""; FindStruct fs = Find(word); int p = fs.p; if (p == -1) return result; if(Base[p]<0) { String r=""; for(int i=-Base[p];Tail[i]!=END_CHAR;++i) { r+= Tail[i]; } result.add(fs.prefix+r); return result; } if(Base[p] > 0) { ArrayList r = GetAllChildWord(p); for(int i=0;i 測(cè) 試
import java.io.BufferedReader; import java.io.FileInputStream; import java.io.IOException; import java.io.InputStream; import java.io.InputStreamReader; import java.util.ArrayList; import java.util.Scanner; import javax.xml.crypto.Data; public class Main { public static void main(String[] args) throws Exception { ArrayListwords = new ArrayList (); BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream("E:/兔子的試驗(yàn)學(xué)習(xí)中心[課內(nèi)]/ACM大賽/ACM第四屆校賽/E命令提示/words3.dic"))); String s; int num = 0; while((s=reader.readLine()) != null) { words.add(s); num ++; } DoubleArrayTrie dat = new DoubleArrayTrie(); for(String word: words) { dat.Insert(word); } System.out.println(dat.Base.length); System.out.println(dat.Tail.length); Scanner sc = new Scanner(System.in); while(sc.hasNext()) { String word = sc.next(); System.out.println(dat.Exists(word)); System.out.println(dat.FindAllWords(word)); } } } 下面是測(cè)試結(jié)果,構(gòu)造6W英文單詞的DAT,大概需要20秒
我增長(zhǎng)數(shù)組的時(shí)候是每次長(zhǎng)度增加到2倍,初始1024
Base和Check數(shù)組的長(zhǎng)度為131072
Tail的長(zhǎng)度為262144
以上就是Java中怎么實(shí)現(xiàn)一個(gè)雙數(shù)組Trie樹(shù),小編相信有部分知識(shí)點(diǎn)可能是我們?nèi)粘9ぷ鲿?huì)見(jiàn)到或用到的。希望你能通過(guò)這篇文章學(xué)到更多知識(shí)。更多詳情敬請(qǐng)關(guān)注創(chuàng)新互聯(lián)行業(yè)資訊頻道。
當(dāng)前題目:Java中怎么實(shí)現(xiàn)一個(gè)雙數(shù)組Trie樹(shù)
網(wǎng)頁(yè)URL:http://weahome.cn/article/gejodg.html