Exploring the Principles Behind CRC32 Implementations - why table-driven implementation

Author : Kevin Lynx
email : zmhn320@163.com

Preface

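In keeping with the principle of not reinventing the wheel, this article tries not to repeat material that is already all over the web.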

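What's CRC?

In short, a CRC is a number used to check that data is correct. Put simply, the CRC value is the
remainder obtained by dividing the data you want to protect by a constant. Once you have this value
you can append it to your data. When the data arrives somewhere else, the receiver takes the original
data (possibly corrupted in transit) and the attached CRC value, divides the received data by the same
constant (agreed on beforehand), and obtains a new CRC value. Comparing the two CRC values tells you
whether the data was corrupted in transit.

So how do you divide your data by a constant? You encode the data as needed, treating it byte by byte
as a number. And what is the constant? You do not need to care what it is, or how it was obtained.
When you actually sit down to write a CRC implementation, I can tell you, and so will the CRC
theorists. Constants of different lengths correspond to different CRC algorithms. When the constant
is 32 bits wide, we get the CRC32 discussed here.

You do not have to understand all of the above, because you will need other material for a complete
treatment of CRC theory anyway.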

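The mathematics behind CRC

Many textbooks relate CRC to polynomials. The polynomials meant here have coefficients that are 0 or 1,
for example: a0 + a1*x + a2*x^2 + ... + an*x^n, where each of a0, a1, ..., an is either 0 or 1. We do
not care what value x takes. (If you insist, you can simply think of x as 2.) Writing out the values
a0, a1, ..., an in order gives a bit stream. For example, 1 + x + x^3 represents the bit stream 1101
(a0 first). Some references reverse the order, giving 1011; that is perfectly normal.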

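What is a generator polynomial?

The generator polynomial is the constant I mentioned above. Note that here a polynomial represents a
bit stream, i.e. a string of 1s and 0s, which taken together is just a number. In CRC32, for example,
the generator polynomial is:
c(x) = 1 + x + x^2 + x^4 + x^5 + x^7 + x^8 + x^10 + x^11 + x^12 + x^16 + x^22 + x^23 + x^26 + x^32.
The corresponding number is 11101101101110001000001100100000 (x^32 is implicit in the actual
computation, so its coefficient is not included here), i.e. 0xEDB88320 (the bits may also be written
in the opposite order, which gives 0x04C11DB7; that is equally correct).

From this you can see that the CRC value can also be viewed as the remainder of our data divided by a
generator polynomial.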

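How is the division done?

Following the method given in most textbooks: since any data can be treated as a plain number, we can
in some sense start dividing right away, although this is not really ordinary division. For example,
let our data be 1101011011 (given directly in binary for convenience, which also shows that CRC works
bit by bit) and let the generator polynomial (its corresponding value) be 10011. Textbooks usually
tell us to shift the data left by a few bits (the number of bits in the generator polynomial minus 1)
before dividing, to make room for the CRC value that will be computed (the CRC value I said gets
appended to the original data). But why do it this way? I do not know either. (What you do not know
should not be glossed over.) So the division looks like this: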
            1100001010
       _______________
10011 ) 11010110110000 the new data, with a few zeros appended
        10011......... the subtraction here (hopefully you remember grade-school arithmetic) is an XOR
        -----.........
         10011........
         10011........
         -----........
          00001....... computed bit by bit
          00000.......
          -----.......
           00010......
           00000......
           -----......
            00101.....
            00000.....
            -----.....
             01011....
             00000....
             -----....
              10110...
              10011...
              -----...
               01010..
               00000..
               -----..
                10100.
                10011.
                -----.
                 01110
                 00000
                 -----
                  1110 = this remainder is the CRC value, often also called the checksum.

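By now I hope you have a more intuitive feel for CRC. What we want to do is implement an algorithm
that computes a CRC. Put plainly: write a program that, given some data and a generator polynomial
(fixed in the case of CRC32), computes the remainder 1110 above.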

The simplest algorithm.

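The simplest implementation is a simulation. We simulate the division above. Following a fairly
comprehensive reference available online, we set up a variable called register. We feed our data into
register one bit at a time, then check whether the highest bit of register is 1; if it is, we XOR
register with the generator polynomial, otherwise we carry on. This is a straightforward simulation of
the division above: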

///
/// The simplest CRC implementation algorithm.
///
Load the register with zero bits.
Augment the message by appending W zero bits to the end of it.
While (more message bits)
   Begin
   Shift the register left by one bit, reading the next bit of the
      augmented message into register bit position 0.
   If (a 1 bit popped out of the register during the shift)
      Register = Register XOR Poly.
   End
The register now contains the remainder.

#include <stdio.h>

/// generator polynomial 10011, including its top bit (W = 4)
#define POLY 0x13

int main()
{
    /// the data: 1101011011
    unsigned short data = 0x035b;
    /// load the register with zero bits
    unsigned short regi = 0x0000;
    /// augment the data by appending W(4) zero bits to the end of it.
    data <<= 4;
    /// we do it bit after bit
    for( int cur_bit = 15; cur_bit >= 0; -- cur_bit )
    {
        /// test the highest bit, which will be popped by the shift below.
        /// in fact, the 5th bit from the right is the highest bit here
        if( ( ( regi >> 4 ) & 0x0001 ) == 0x1 )
        {
            regi = regi ^ POLY;
        }
        /// shift the register
        regi <<= 1;
        /// read the next bit of the augmented data
        unsigned short tmp = ( data >> cur_bit ) & 0x0001;
        regi |= tmp;
    }
    /// the bit read last has not been reduced yet, so check once more
    if( ( ( regi >> 4 ) & 0x0001 ) == 0x1 )
    {
        regi = regi ^ POLY;
    }
    /// and now, register contains the remainder which is also called CRC value.
    printf( "CRC = 0x%X\n", (unsigned)regi );   /// prints CRC = 0xE (binary 1110)
    return 0;
}

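A better algorithm?

Easy-to-understand algorithms like this one are often not what gets used in practice. Working bit by
bit is simply too slow. You probably know that typical CRC32 implementations are table-driven, but you
may not know where the table comes from.

One way to improve on the bit-after-bit approach is to widen the unit of work; the typical choice is a
byte. Here I need to describe the process of the algorithm above in detail:

Each time, we first check whether the highest bit of register is 1; if it is, we XOR the generator
polynomial (the so-called Poly) into register. Then we shift register left by one bit, which discards
the highest bit, and put one bit of our data into the lowest bit of register.

In other words, a given bit of register determines the values of the bits after it. If we label the
bits of the top byte of register t7 t6 t5 t4 t3 t2 t1 t0, then t7 (if it is 1) affects t6-t0, t6
affects t5-t0, and so on. But no matter who affects whom, after the algorithm above has iterated over
one byte (8 bits), t7-t0 will all have been discarded (whatever you do). The only thing that remains is
their effect on the bytes that follow this byte.

So if we could obtain this effect directly, we could process the data byte after byte rather than bit
after bit. How do we obtain this effect, and what is it? It is exactly what the table entries in the
table-driven CRC algorithm hold!

But why can the bit-by-bit computation be collapsed into a single step? In fact, we have not
simplified it at all. A teaching version of the algorithm computes this effect value on the fly: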

///
/// The table-driven CRC implement algorithm part 1.
///
While (augmented message is not exhausted)
   Begin
   Examine the top byte of the register
   Calculate the control byte from the top byte of the register
   Sum all the Polys at various offsets that are to be XORed into
      the register in accordance with the control byte
   Shift the register left by one byte, reading a new message byte
      into the rightmost byte of the register
   XOR the summed polys to the register
   End

#include <stdio.h>
#include <string.h>

/// CRC32 generator polynomial; the x^32 term is implicit
#define POLY 0x04C11DB7UL

int main()
{
    /// the data
    unsigned long data = 0x1011035b;
    /// load the register with zero bits
    unsigned long regi = 0;
    /// memory for the AUGMENTED data: 4 data bytes plus 4 zero bytes
    unsigned char p[8];
    /// copy data (byte order follows the host; a little-endian host gives 5b 03 11 10)
    memset( p, 0, 8 );
    memcpy( p, &data, 4 );
    /// 4 data bytes + 4 appended zero bytes
    for( int i = 0; i < 8; ++ i )
    {
        /// get the top byte of the register
        unsigned char top_byte = (unsigned char)( ( regi >> 24 ) & 0xff );
        /// sum all the polys at various offsets
        unsigned long sum_poly = (unsigned long)top_byte << 24;
        for( int j = 0; j < 8; ++ j )
        {
            /// check the top bit
            if( ( sum_poly >> 31 ) != 0 )
            {
                /// '<<' comes first because POLY omits the x^32 term: the
                /// bit shifted out is the one that implicit term cancels
                sum_poly = ( sum_poly << 1 ) ^ POLY;
            }
            else
            {
                sum_poly <<= 1;
            }
            /// keep 32 bits even where unsigned long is wider
            sum_poly &= 0xFFFFFFFFUL;
        }
        /// shift the register left by one byte, reading a new message byte
        regi = ( ( regi << 8 ) | p[i] );
        /// xor the summed polys to the register
        regi ^= sum_poly;
        /// keep the register 32 bits wide even where unsigned long is wider
        regi &= 0xFFFFFFFFUL;
    }
    /// and now, register contains the remainder which is also called CRC value.
    printf( "CRC = 0x%08lX\n", regi );
    return 0;
}

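In the listing above, this part:

/// sum all the polys at various offsets
unsigned long sum_poly = (unsigned long)top_byte << 24;
for( int j = 0; j < 8; ++ j )
{
    /// check the top bit
    if( ( sum_poly >> 31 ) != 0 )
    {
        /// '<<' comes first because POLY omits the x^32 term
        sum_poly = ( sum_poly << 1 ) ^ POLY;
    }
    else
    {
        sum_poly <<= 1;
    }
    /// keep 32 bits even where unsigned long is wider
    sum_poly &= 0xFFFFFFFFUL;
}

is what computes this effect value. In fact, the table in the table-driven CRC algorithm is generated
by exactly this code (leaving aside some other details). You may not fully follow this; I suggest
ignoring the details here (see the references for more). What you need to know is that we have merged
eight bit-by-bit operations into one byte operation, and that this byte operation is the combined
effect of the eight bit operations (the effect value mentioned above). The byte operation is really
just a number, namely one entry of the table in the table-driven CRC algorithm. Each distinct sequence
of bit operations corresponds to a distinct unsigned char value, which is why the table has 256
entries.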

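Show me where the table is:

As said above, a table can easily be introduced into the algorithm above.

Further simplification:

A typical feature of the algorithms above is that they append some zeros to our data, which leads to a
lot of useless computation. One simplification is to fold this useless computation into the rest of
the computation; it all comes down to some bit-manipulation tricks. The listing that follows only
introduces the table and still appends the zeros; the zeros are removed in the section after it.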

///
/// The table-driven CRC implement algorithm part 2.
///
While (augmented message is not exhausted)
   Begin
   Examine the top byte of the register
   Calculate the control byte from the top byte of the register
   Sum all the Polys at various offsets that are to be XORed into
      the register in accordance with the control byte
   Shift the register left by one byte, reading a new message byte
      into the rightmost byte of the register
   XOR the summed polys to the register
   End

#include <stdio.h>
#include <string.h>

/// CRC32 generator polynomial; the x^32 term is implicit
#define POLY 0x04C11DB7UL

unsigned long get_sum_poly( unsigned char top_byte )
{
    /// sum all the polys at various offsets
    unsigned long sum_poly = (unsigned long)top_byte << 24;
    for( int j = 0; j < 8; ++ j )
    {
        /// check the top bit
        if( ( sum_poly >> 31 ) != 0 )
        {
            /// '<<' comes first because POLY omits the x^32 term
            sum_poly = ( sum_poly << 1 ) ^ POLY;
        }
        else
        {
            sum_poly <<= 1;
        }
        /// keep 32 bits even where unsigned long is wider
        sum_poly &= 0xFFFFFFFFUL;
    }
    return sum_poly;
}

void create_table( unsigned long *table )
{
    for( int i = 0; i < 256; ++ i )
    {
        table[i] = get_sum_poly( (unsigned char) i );
    }
}

int main()
{
    /// the data
    unsigned long data = 0x1011035b;
    /// load the register with zero bits
    unsigned long regi = 0;
    /// memory for the AUGMENTED data: 4 data bytes plus 4 zero bytes
    unsigned char p[8];
    /// copy data (byte order follows the host)
    memset( p, 0, 8 );
    memcpy( p, &data, 4 );
    /// the table
    unsigned long table[256];
    /// create the table
    create_table( table );
    /// 4 data bytes + 4 appended zero bytes
    for( int i = 0; i < 8; ++ i )
    {
        /// get the top byte of the register
        unsigned char top_byte = (unsigned char)( ( regi >> 24 ) & 0xff );
        /// shift the register left by one byte, reading a new message byte
        regi = ( ( regi << 8 ) | p[i] );
        /// xor the summed polys to the register
        regi ^= table[top_byte];
        /// keep the register 32 bits wide even where unsigned long is wider
        regi &= 0xFFFFFFFFUL;
    }
    /// and now, register contains the remainder which is also called CRC value.
    printf( "CRC = 0x%08lX\n", regi );
    return 0;
}

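Those annoying appended zeros

A prominent feature of the algorithms above is that they append a lot of zeros to our data. Appending
the zeros also adds a lot of useless operations. Let us get rid of these annoying zeros: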

/// the #includes, POLY and create_table() are the same as in the previous listing
int main()
{
    /// the data
    unsigned long data = 0x1011035b;
    /// load the register with zero bits
    unsigned long regi = 0;
    /// allocate memory to contain the data
    unsigned char p[4];
    /// copy data (byte order follows the host)
    memcpy( p, &data, 4 );
    /// the table
    unsigned long table[256];
    /// create the table
    create_table( table );
    /// because data contains 4 bytes
    for( int i = 0; i < 4; ++ i )
    {
        regi = ( regi << 8 ) ^ table[ ( regi >> 24 ) ^ p[i] ];
        /// keep the register 32 bits wide even where unsigned long is wider
        regi &= 0xFFFFFFFFUL;
    }
    /// and now, register contains the remainder which is also called CRC value.
    printf( "CRC = 0x%08lX\n", regi );
    return 0;
}

The key line, regi = ( regi << 8 ) ^ table[ ( regi >> 24 ) ^ p[i] ];, eliminates a lot of useless operations.

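In practice:

I seem to have made everything sound simple. I suspect that is only because I have not explained it
clearly. I have tried to keep your attention on the key points. Having come this far, it may seem that
we could immediately write our own CRC32 and put it to use. But you will soon find that things are not
as simple as you imagined.

In practice, the bits of the data are often reflected, e.g. 1010 becomes 0101. This happens because
some hardware implementations of CRC adopted this (ugly) convention, and some software implementations
followed suit.

In addition, regarding the initial value of register, some CRC algorithms initialise it to 0xffffffff.
Below is an algorithm that performs bit reflection and can directly output the table used by the
table-driven version: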

///
/// The table-driven CRC implement algorithm part 4.
///
/// Do not need to augment W/8 zero bytes.
///
#include <stdio.h>
#include <string.h>

/// CRC32 generator polynomial; the x^32 term is implicit
#define POLY 0x04C11DB7UL
#define BITMASK(X) (1UL << (X))

/// reflect the bottom b bits of v
unsigned long reflect( unsigned long v, int b )
{
    int i;
    unsigned long t = v;
    for( i = 0; i < b; ++ i )
    {
        if( t & 1UL )
            v |= BITMASK( (b-1)-i );
        else
            v &= ~BITMASK( (b-1)-i );
        t >>= 1;
    }
    return v;
}

/// i'll try to write a correct algorithm
unsigned long get_sum_poly( unsigned char byte )
{
    byte = (unsigned char) reflect( byte, 8 );
    unsigned long sum_poly = (unsigned long)byte << 24;
    for( int i = 0; i < 8; ++ i )
    {
        /// check the top bit
        if( ( sum_poly >> 31 ) != 0 )
        {
            /// '<<' comes first because POLY omits the x^32 term
            sum_poly = ( sum_poly << 1 ) ^ POLY;
        }
        else
        {
            sum_poly <<= 1;
        }
        /// keep 32 bits even where unsigned long is wider
        sum_poly &= 0xFFFFFFFFUL;
    }
    sum_poly = reflect( sum_poly, 32 );
    return sum_poly;
}

void create_table( unsigned long *table )
{
    for( int i = 0; i <= 255; ++ i )
    {
        table[i] = get_sum_poly( (unsigned char) i );
    }
}

void output_table( const unsigned long *table )
{
    FILE *fp = fopen( "table.txt", "w" );
    if( fp == NULL )
    {
        return;
    }
    for( int y = 0; y < 64; ++ y )
    {
        fprintf( fp, "0x%08lXL,\t0x%08lXL,\t0x%08lXL,\t0x%08lXL, \n",
            table[ y * 4 + 0],
            table[ y * 4 + 1],
            table[ y * 4 + 2],
            table[ y * 4 + 3] );
    }
    fclose( fp );
}

int main()
{
    /// the table
    unsigned long table[256];
    /// the data
    unsigned long data = 0x1011035b;
    /// load the register with zero bits
    unsigned long regi = 0;
    /// allocate memory to contain the data
    unsigned char p[4];
    /// copy data (byte order follows the host)
    memcpy( p, &data, 4 );
    /// create the table
    create_table( table );
    /// output the table
    output_table( table );
    /// because data contains 4 bytes
    for( int i = 0; i < 4; ++ i )
    {
        /// the table is reflected, so the register shifts right and
        /// the data byte is XORed into the low byte
        regi = ( regi >> 8 ) ^ table[ ( regi ^ p[i] ) & 0xff ];
    }
    /// and now, register contains the remainder which is also called CRC value.
    printf( "CRC = 0x%08lX\n", regi );
    return 0;
}

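Please FORGIVE me

I do not think I have explained the whole process thoroughly, but I hope you now understand the
general principle: where the magic table in the table-driven algorithm comes from, how the CRC32
algorithm is derived, and so on.

Download the code for this article: http://www.shnenglu.com/Files/kevinlynx/CRC%20Implement.rar

References: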
http://www34.brinkster.com/dizzyk/math-crc.asp
http://www.greenend.org.uk/rjk/2004/crc.html
http://www.ross.net/crc/crcpaper.html

Posted on 2008-04-01 21:22 by Kevin Lynx. Categories: game develop, general programming

Why can
for( int i = 0; i < 8; ++ i )
{
    /// get the top byte of the register
    unsigned char top_byte = (unsigned char)( ( regi >> 24 ) & 0xff );
    /// shift the register left by on byte, reading a new
    regi = ( ( regi << 8 ) | p[i] );
    /// xor the summed polys to the register
    regi ^= table[top_byte];
}

be transformed into:

for( int i = 0; i < 4; ++ i )
{
    regi = ( regi << 8 ) ^ table[ ( regi >> 24 ) ^ p[i] ];//???????
}

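# re: Exploring the Principles Behind CRC32 Implementations - why table-driven implementation 2009-04-24 10:33 microtiger

Good article, I am using it as a reference. Quoting and referencing is unavoidable; always learning!

# re: Exploring the Principles Behind CRC32 Implementations - why table-driven implementation 2009-05-13 15:56 1984meng

I can't understand it, I really can't, but that is no excuse.

# re: Exploring the Principles Behind CRC32 Implementations - why table-driven implementation 2009-06-07 23:43 thssld

The output of your second listing differs from HashClash,
but it matches my hand calculation.
Is that because those tools also mix in some other data?

# re: Exploring the Principles Behind CRC32 Implementations - why table-driven implementation 2009-07-07 16:30 litandy

This is far too messy. It reads as if you do not want anyone to understand it, while showing off how much you know.

# re: Exploring the Principles Behind CRC32 Implementations - why table-driven implementation 2009-07-23 22:57 cctm

What a baffling article...

# re: Exploring the Principles Behind CRC32 Implementations - why table-driven implementation 2009-11-27 13:27 liangzuolin

I finally understood it, nicely written! Friends who did not get it can consult other, more basic material, but the program is nicely written!

# re: Exploring the Principles Behind CRC32 Implementations - why table-driven implementation 2012-07-06 15:23 PG

Was this translated from something? The wording is a bit stiff.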

