Problem Solving using C++

Algorithm Study using C++

字符串匹配算法(2)---String Matching Algorithm

前面已經(jīng)說到了naive的方法來匹配字符串，但是該方法的特點(diǎn)是復(fù)雜度高，未能充分利用計(jì)算得到的值作為下一步計(jì)算的結(jié)果，因此，復(fù)雜度相當(dāng)比較高。
Rabin-Karp提出了新的字符串匹配方法：
Rabin-Karp方法主要是分2部分：
(1)預(yù)處理(pre-processing)
對(duì)pattern字符串的m個(gè)進(jìn)行計(jì)算，得到其hash值。復(fù)雜度為O(m)
(2)匹配(matching)
for(int i=0;i<n-m;i++)
{
    計(jì)算t[i..i+m]的hash值。
    再同pattern字符串的hash值進(jìn)行對(duì)比，如果相同，則使用naive辦法，繼續(xù)一個(gè)字符一個(gè)字符地比較。
    使用Horner法則計(jì)算下一個(gè)m字符的hash值。(即h(s+1)=f(h(s))).
}
整個(gè)算法的復(fù)雜度：
在最差的情況下，復(fù)雜度為O(m)+O(m*(n-m+1))
期望的情況為O(n-m+1+cm)=O(n+m).

實(shí)驗(yàn)代碼如下：（暫時(shí)只是考慮支持12345678901234567890 匹配 234等的情況）

1 #include <iostream>
2 #include <string>
3
4 using namespace std;
5
6 #define Q    997 //Should be a prime
7 #define D    10    //|sizeof character set|
8
9 inline int geth(int m)//h=d^(m-1)%q
10 {
11     int len=1;
12     for(int i=0;i<(m-1);i++)
13     {
14         len*=D;
15     }
16     return len%Q;
17 }
18
19 int main(int argc,char* argv[])
20 {
21     char *t=NULL,*p=NULL;
22     string text,pattern;
23     int h,p0=0,t0=0;
24
25     while(1)
26     {
27         int index = 0;
28
29         cin>>text>>pattern;
30         int n = text.length();
31         int m = pattern.length();
32
33         //t= new char[n+1];
34         //p= new char[m+1];
35         t= new char[n];
36         p= new char[m];
37
38         h = geth(m);
39         p0=t0=0;
40
41         //strcpy(t,text.c_str());
42         //strcpy(p,pattern.c_str());
43         memcpy(t,text.data(),n);
44         memcpy(p,pattern.data(),m);
45
46         for(int i=0;i<m;i++)//get the initial value from pattern and text
47         {
48             p0 =(D*p0+(p[i]-'0'))%Q;
49             t0 = (D*t0+(t[i]-'0'))%Q;
50         }
51
52         for(int i=0;i<(n-m);i++)
53         {
54             if(p0==t0)//should call the naive algorithm to check whether the two string is the same
55             {
56                 bool match = true;
57                 for(int j=0;j<m;j++)
58                 {
59                     if(p[j]!=t[i+j])
60                         match = false;
61                 }
62
63                 if(match)
64                     cout<<i<<endl;
65             }
66
67             t0 = ((D*(t0-(t[i]-'0')*h)+t[i+m])-'0')%Q;
68         }
69
70         delete[] t;
71         delete[] p;
72     }
73
74     system("pause");
75     return 0;
76 }
77

posted on 2007-08-20 21:11 Kingoal Lee's Alogrithm Study using cplusplus 閱讀(469) 評(píng)論(0) 編輯收藏引用

只有注冊用戶登錄后才能發(fā)表評(píng)論。
【推薦】100%開源！大型工業(yè)跨平臺(tái)軟件C++源碼提供，建模，組態(tài)！



網(wǎng)站導(dǎo)航: 博客園 IT新聞 BlogJava 博問 Chat2DB 管理

青青草原综合久久大伊人导航_色综合久久天天综合_日日噜噜夜夜狠狠久久丁香五月_热久久这里只有精品

Problem Solving using C++

字符串匹配算法(2)---String Matching Algorithm

My Links

Blog Stats

常用鏈接

留言簿(1)

隨筆檔案

搜索

最新評(píng)論

閱讀排行榜

評(píng)論排行榜