Yesterday Lao Yutou (老魚頭) recommended the Ragel State Machine Compiler to us, a tool that generates protocol-handling code. He also gave an example, just a few simple lines:
%%{
machine atoi;
write data;
}%%
int atoi( char *str )
{
char *p = str;
int cs, val = 0;
bool neg = false;
%%{
action see_neg {
neg = true;
}
action add_digit {
val = val * 10 + (fc - '0');
}
main :=
( '-'@see_neg | '+' )? ( digit @add_digit )+
'\n' @{ fbreak; };
# Initialize and execute.
write init;
write exec noend;
}%%
if ( neg )
val = -1 * val;
if ( cs < atoi_first_final )
cerr << "atoi: there was an error" << endl;
return val;
};
The generated state-machine code is more efficient than the 500-plus-line atoi implementation in C. For example, the snippet above generates the following C code:
int atoi( char *str )
{
char *p = str;
int cs, val = 0;
bool neg = false;
#line 27 "atoi.c"
{
cs = atoi_start;
}
#line 31 "atoi.c"
{
switch ( cs )
{
case 1:
switch( (*p) ) {
case 43: goto st2;
case 45: goto tr2;
}
if ( 48 <= (*p) && (*p) <= 57 )
goto tr3;
goto st0;
st0:
goto _out0;
tr2:
#line 23 "atoi.rl"
{
neg = true;
}
goto st2;
st2:
p += 1;
case 2:
#line 52 "atoi.c"
if ( 48 <= (*p) && (*p) <= 57 )
goto tr3;
goto st0;
tr3:
#line 27 "atoi.rl"
{
val = val * 10 + ((*p) - '0');
}
goto st3;
st3:
p += 1;
case 3:
#line 63 "atoi.c"
if ( (*p) == 10 )
goto tr4;
if ( 48 <= (*p) && (*p) <= 57 )
goto tr3;
goto st0;
tr4:
#line 33 "atoi.rl"
{ goto _out4; }
goto st4;
st4:
p += 1;
case 4:
#line 74 "atoi.c"
goto st0;
}
_out0: cs = 0; goto _out;
_out4: cs = 4; goto _out;
_out: {}
}
#line 38 "atoi.rl"
if ( neg )
val = -1 * val;
if ( cs < atoi_first_final )
cerr << "atoi: there was an error" << endl;
return val;
};
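The generated code above is a four-state machine driven by goto: state 1 is the start, state 2 follows a sign, state 3 means at least one digit has been read, and state 4 is the final state reached on '\n'. As a rough illustration of the same transitions, here is a hand-written sketch in Java, which has no goto, so it uses a switch inside a loop. The class name AtoiSketch, the state constants, and the choice to throw an exception instead of printing to cerr are assumptions made for this sketch, not Ragel output. Unlike the noend variant above, it also stops at the end of the input.

```java
// Hypothetical hand-written equivalent of the generated atoi machine.
final class AtoiSketch {
    // State numbers mirror the generated C code: 0 = error, 1 = start,
    // 2 = sign seen, 3 = at least one digit seen, 4 = final (newline seen).
    static final int ERROR = 0, START = 1, SIGN = 2, DIGITS = 3, FINAL = 4;

    /** Parses an optionally signed decimal integer terminated by '\n'. */
    static int parse(String str) {
        int cs = START;
        int val = 0;
        boolean neg = false;

        // Stop at end of input, on an error, or once the final state is reached.
        for (int p = 0; p < str.length() && cs != ERROR && cs != FINAL; p++) {
            char c = str.charAt(p);
            boolean isDigit = c >= '0' && c <= '9';
            switch (cs) {
                case START:
                    if (c == '-') { neg = true; cs = SIGN; }          // see_neg
                    else if (c == '+') { cs = SIGN; }
                    else if (isDigit) { val = val * 10 + (c - '0'); cs = DIGITS; }
                    else { cs = ERROR; }
                    break;
                case SIGN:
                    if (isDigit) { val = val * 10 + (c - '0'); cs = DIGITS; }
                    else { cs = ERROR; }
                    break;
                case DIGITS:
                    if (c == '\n') { cs = FINAL; }                    // fbreak
                    else if (isDigit) { val = val * 10 + (c - '0'); } // add_digit
                    else { cs = ERROR; }
                    break;
                default:
                    cs = ERROR;
            }
        }
        // Same role as the "cs < atoi_first_final" check in the original.
        if (cs != FINAL) {
            throw new NumberFormatException("atoi: there was an error");
        }
        return neg ? -val : val;
    }
}
```

For example, AtoiSketch.parse("-123\n") returns -123, while input without the trailing newline is rejected because the machine never reaches its final state.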
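He said that Nginx spends several hundred lines on parsing the HTTP protocol, whereas with Ragel a hundred-odd lines is enough and the result is more efficient, so human hand-optimizers are no longer worth much (see the http11_parser.rl code on the website).
Today I tried it out by writing a check for whether a Java String consists only of digits: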
public class IsInt
{
%%{
machine is_int;
write data noerror;
}%%
public static void main(String[] args)
{
long begin = System.currentTimeMillis();
for (int i=0; i<100000000; i++) {
isIntStr("123456789p");
isIntStr("8487389247");
}
System.out.println(System.currentTimeMillis() - begin);
begin = System.currentTimeMillis();
for (int i=0; i<100000000; i++) {
isAllNumber("123456789p");
isAllNumber("8487389247");
}
System.out.println(System.currentTimeMillis() - begin);
}
public static boolean isAllNumber(String str)
{
char[] c = str.toCharArray();
boolean blReturn = true;
for(int ni=0; ni<c.length; ni++)
{
if(c[ni]<48 || c[ni]>57)
{
blReturn = false;
break;
}
}
return blReturn;
}
public static boolean isIntStr(String str)
{
char[] data = str.toCharArray();
int p=0, cs=0;
boolean isInt = true;
%%{
# Note: a leading digit also matches "any", so this action runs on the very first character.
main := (digit+)? any @{ isInt = false; fbreak; };
write init;
write exec noend;
}%%
return isInt;
}
}
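I generated the Java code with the command ragel.exe -J IsInt.rl | rlgen-java.exe, then compiled and ran it. The output, in milliseconds (the Ragel-based isIntStr first, then the hand-written isAllNumber), was: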
27750
30938
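So in this run the generated code was faster than the simple implementation, taking roughly 10% less time :)
It turns out that Mongrel, the server used with the RoR stack, is also built with Ragel.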
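For comparison, here is a minimal sketch of the intended check (accept only when every character is a digit, stop at the first non-digit) written by hand as a table-driven state machine. Java has no goto, so a transition table indexed by state and character class is a natural shape. The class name DigitCheckSketch, the state names, and the table are all hypothetical and written for illustration; this is not Ragel output.

```java
// Hypothetical hand-written table-driven check; not actual Ragel output.
final class DigitCheckSketch {
    private static final int ACCEPT = 0, REJECT = 1;

    // Character classes: 0 = ASCII digit, 1 = anything else.
    // NEXT[state][class] gives the next state.
    private static final int[][] NEXT = {
        { ACCEPT, REJECT },  // from ACCEPT
        { REJECT, REJECT },  // from REJECT
    };

    /** Returns true when every character of str is an ASCII digit. */
    static boolean isIntStr(String str) {
        int cs = ACCEPT;
        // Bounds-checked loop that leaves early once the machine rejects.
        for (int p = 0, pe = str.length(); p < pe && cs != REJECT; p++) {
            char c = str.charAt(p);
            int cls = (c >= '0' && c <= '9') ? 0 : 1;
            cs = NEXT[cs][cls];
        }
        return cs == ACCEPT;
    }
}
```

Like isAllNumber, this sketch treats the empty string as valid, since the machine starts in its accepting state and no character moves it out.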