色综合久久天天综线观看,久久99精品国产麻豆,日韩精品久久久久久久电影蜜臀

基于特定小范圍關鍵字的原位統計排序

基本原理
在數據輸入隨機分布的情況下，快速排序具有較好的性能表現，但當元素個數比其關鍵字的取值范圍大，而這個范圍相對較小時，使用一種關鍵字索引統計排序會快很多，因為它的時間復雜度是線性的，基本原理是使用數組（為描述方便，特稱統計數組），其下標對應關鍵字的值，存儲元素按待排序關鍵字的值統計的出現次數，然后再按元素關鍵字的值，結合統計數組，放回到最終位置上。常規的實現是使用一個輔助空間來有序存儲原來的所有元素，再從輔助空間拷貝到原空間，而這個輔助空間大小至少和原空間一樣大，因此這種實現很耗內存，它的好處在于算法是穩定的。為了減少內存占用和拷貝開銷，進一步提高性能，本文講述一種改進的方案，僅在原空間上移動數據就能達到排序效果，但犧牲了穩定性。為方便下文的討論描述，設關鍵字取值范圍為KeyRange，值域是閉區間[min,max]，min為最小值，max為最大值，統計數組為CntArray。顯然，這里關鍵字的類型是整型，在標準c/c++中，整型至少有char,unsigned char,short,unsigned short,int,unsigned int,long,unsigned long 8種，對于有符號整型，KeyRange中的min和（或）max會取負值，所以要映射到CntArray的下標0到max-min之間。

設計實現
在算法設計封裝上，為了支持在編譯期或運行期給出關鍵字min和max的值，每種元素類型的實現有兩種函數版本：對于運行期版本，考慮到當min和max的實際取值在較小的范圍內，統計數組可以在棧而不是堆上分配，這樣有助于一定的性能提升，因而使用了boost中的auto_buffer組件，為方便靈活配置auto_buffer的內部靜態空間大小，因此在函數外部提供了接口以輸入這個閾值（常量模板參數）；對于編譯期版本，為了代碼重用，內部則是調用對應的運行期版本來實現的。關于統計數組的運用，其個數和空間大小則取決于因元素類型而不同的實現，這里元素類型分為三種：基本整型、類類型和指針類型，下面就這三種類型分別詳述對應的算法實現。

實現算法的函數名稱為statistic_sort，對于不同的元素類型，函數所帶的參數不同，類類型和指針類型比基本整型多了一個獲取關鍵字方法的參數，這里的參數包括模板類型形參和函數調用形參，其內部實現則調用了輔助函數__init_index_array和__aux_index_sort。
1）基本整型：在這種情況下，元素本身就是關鍵字，CntArray每項存儲的是下標值對應的元素出現的次數，空間大小是max-min+1，對應的運行期版本實現如下

template<class _KeyT,class _RandIt>
2

void __init_index_array(_RandIt _first,_RandIt _last,_KeyT _min,_KeyT _max,size_t* cnt)
3

{
4

assert(_min <= _max);
5

std::fill_n(cnt,_max-_min+1,0);
6

}
7

template<class _KeyT,class _RandIt>
9

void __aux_index_sort(_RandIt _first,_RandIt _last,_KeyT _min,_KeyT _max,size_t* cnt)
10

{
11

typedef typename std::iterator_traits<_RandIt>::value_type value_type;
12

BOOST_STATIC_ASSERT((boost::is_same<_KeyT,value_type>::value));
13

for (_RandIt it = _first; it != _last; ++it)
15

{
16

assert(_min <= *it && *it <= _max);
17

++cnt[*it-_min];
18

}
19

for (size_t i = 0,j = 0; i < _max - _min + 1; ++i)
20

{
21

while(cnt[i]--) *(_first + (j++)) = i + _min;
22

}
23

}
24

template<class _KeyT,size_t N,class _RandIt>
26

void statistic_sort(_RandIt _first,_RandIt _last,_KeyT _min,_KeyT _max)
27

{
28

using namespace boost::signals2::detail;
29

auto_buffer<size_t,store_n_objects<N> > cnt(_max-_min+1);
30

__init_index_array<_KeyT>(_first,_last,_min,_max,cnt.data());
32

__aux_index_sort<_KeyT>(_first,_last,_min,_max,cnt.data());
33

}

編譯期版本如下

template<class _KeyT,_KeyT _min,_KeyT _max,class _RandIt>
2

inline void __init_index_array(_RandIt _first,_RandIt _last,size_t* cnt)
3

{
4

BOOST_STATIC_ASSERT(_min<=_max);
5

__init_index_array(_first,_last,_min,_max,cnt);
6

}
7

template<class _KeyT,_KeyT _min,_KeyT _max,class _RandIt>
9

void __aux_index_sort(_RandIt _first,_RandIt _last,size_t* cnt)
10

{
11

BOOST_STATIC_ASSERT(_min<=_max);
12

__aux_index_sort(_first,_last,_min,_max,cnt);
13

}
14

template<class _KeyT,_KeyT _min,_KeyT _max,class _RandIt>
16

void statistic_sort(_RandIt _first,_RandIt _last)
17

{
18

size_t cnt[_max-_min+1];
19

__init_index_array<_KeyT,_min,_max>(_first,_last,cnt);
21

__aux_index_sort<_KeyT,_min,_max>(_first,_last,cnt);
22

}

從上面可看出，有兩處使用了BOOST_STATIC_ASSERT宏在編譯期檢測減免不必要的錯誤，一是診斷關鍵字的取值范圍是否合理，二是診斷迭代器指向的元素和關鍵字類型是否相同，這一技術在下面的代碼中也多次用到。
2）類類型：在這種情況下，用到了兩個CntArray，一個是起始數組Beg，空間大小為max-min+2（其實下標max-min+1用不上），每項存儲的是比對應關鍵字小的所有元素出現的次數，也就是關鍵字在序列中的起始索引；另一個是結束數組End，每項存儲的是對應關鍵字的結束索引。這兩個數組用來判斷元素是否處于最終正確的位置上，如果是，那么就不移動元素，否則，要移動元素到End指示的位置上，在移動前，為防止覆蓋目標位置上的元素，需臨時保存目標位置處的元素和該元素對應的目標位置，繼續這個過程，直到目標位置等于當前位置時停止。不同于基本整型，元素的關鍵字是元素的內部成員，需要通過一種方法來獲取，而這種方法可以是自由普通函數，也可以是類的成員函數，還可以是函數對象，只要最終能滿足調用規則即可，這個調用規則是返回關鍵字類型，帶1個類型為元素常量引用的參數。對應的運行期版本實現如下

template<class _KeyT,class _RandIt,class _Func>
2

void __init_index_array(_RandIt _first,_RandIt _last,_KeyT _min,_KeyT _max,size_t* cnt_beg,size_t* cnt_end,_Func _getKey)
3

{
4

typedef typename std::iterator_traits<_RandIt>::value_type value_type;
5

typedef typename boost::remove_pointer<_Func>::type F;
7

typedef typename boost::function_traits<F>::result_type result_type;
8

typedef typename boost::function_traits<F>::arg1_type arg_type;
9

typedef typename boost::add_const<value_type>::type const_type;
10

typedef typename boost::add_reference<const_type>::type const_reference;
11

BOOST_STATIC_ASSERT((boost::is_same<result_type,_KeyT>::value));
13

BOOST_STATIC_ASSERT((boost::is_same<arg_type,const_reference>::value));
14

size_t N = _max - _min + 1;
16

std::fill_n(cnt_beg,N + 1,0);
17

for (_RandIt it = _first; it != _last; ++it)
19

{
20

assert(_min <= _getKey(*it) && _getKey(*it) <= _max);
21

++cnt_beg[_getKey(*it) - _min + 1];
22

}
23

for (size_t i = 1;i < N;++i)
24

{
25

cnt_beg[i] += cnt_beg[i - 1];
26

}
27

std::copy(cnt_beg,cnt_beg + N,cnt_end);
28

}
29

template<class _KeyT,class _RandIt,class _Func>
31

void __aux_index_sort(_RandIt _first,_RandIt _last,_KeyT _min,_KeyT _max,size_t* cnt_beg,size_t* cnt_end,_Func _getKey)
32

{
33

typedef typename std::iterator_traits<_RandIt>::value_type value_type;
34

typedef typename boost::remove_pointer<_Func>::type F;
36

typedef typename boost::function_traits<F>::result_type result_type;
37

typedef typename boost::function_traits<F>::arg1_type arg_type;
38

typedef typename boost::add_const<value_type>::type const_type;
39

typedef typename boost::add_reference<const_type>::type const_reference;
40

BOOST_STATIC_ASSERT((boost::is_same<result_type,_KeyT>::value));
42

BOOST_STATIC_ASSERT((boost::is_same<arg_type,const_reference>::value));
43

size_t beg,end,pos,k;
45

value_type val,tmp_val;
46

_KeyT key,tmp_key;
47

for (_RandIt it = _first; it != _last; ++it)
49

{
50

val = *it; key = _getKey(val);
51

assert(_min <= key && key <= _max);
52

beg = cnt_beg[key-_min],end = cnt_end[key-_min],pos = std::distance(_first,it);
54

if (pos < beg)
55

{
56

for (k = end; k != pos; k = cnt_end[tmp_key-_min],val = tmp_val)
57

{
58

tmp_val = *(_first + k); tmp_key = _getKey(tmp_val);
59

*(_first + k) = val; ++cnt_end[_getKey(val)-_min];
60

}
61

*(_first + k) = val; ++cnt_end[_getKey(val)-_min];
62

}
63

else if (pos == end)
64

{
65

++cnt_end[key-_min];
66

}
67

else if (pos > end)
68

assert(false);
69

}
70

}
71

template<class _KeyT,size_t N,class _RandIt,class _Func>
73

void statistic_sort(_RandIt _first,_RandIt _last,_KeyT _min,_KeyT _max,_Func _getKey)
74

{
75

size_t len = _max - _min + 1;
76

using namespace boost::signals2::detail;
77

auto_buffer<size_t,store_n_objects<N> > cnt_beg(len + 1),cnt_end(len);
78

__init_index_array(_first,_last,_min,_max,cnt_beg.data(),cnt_end.data(),_getKey);
80

__aux_index_sort(_first,_last,_min,_max,cnt_beg.data(),cnt_end.data(),_getKey);
81

}

以上代碼有兩處用到了boost中的function_traits，一是來獲取_getKey的返回類型，二是來獲取_getKey的參數類型，并用BOOST_STATIC_ASSERT來診斷它們是否和預期的類型相同。由于function_traits尚不支持函數對象，為了支持使用類的成員函數這一方式來獲取元素關鍵字，所以這里對它進行了特化，代碼如下

namespace boost
2

{
3

template<class _Result,class _Ty>
4

struct function_traits<std::const_mem_fun_ref_t<_Result,_Ty> >
5

{
6

typedef _Result result_type;
7

typedef _Ty const& arg1_type;
8

};
9

template<class _Result,class _Ty>
10

struct function_traits<std::const_mem_fun_t<_Result,_Ty> >
11

{
12

typedef _Result result_type;
13

typedef _Ty* const& arg1_type;
14

};
15

}

編譯期版本實現如下

template<class _KeyT,_KeyT _min,_KeyT _max,class _RandIt,class _Func>
2

void __init_index_array(_RandIt _first,_RandIt _last,size_t* cnt_beg,size_t* cnt_end,_Func _getKey)
3

{
4

BOOST_STATIC_ASSERT(_min <= _max);
5

__init_index_array(_first,_last,_min,_max,cnt_beg,cnt_end,_getKey);
6

}
7

template<class _KeyT,_KeyT _min,_KeyT _max,class _RandIt,class _Func>
9

void __aux_index_sort(_RandIt _first,_RandIt _last, size_t* cnt_beg,size_t* cnt_end,_Func _getKey)
10

{
11

BOOST_STATIC_ASSERT(_min <= _max);
12

__aux_index_sort(_first,_last,_min,_max,cnt_beg,cnt_end,_getKey);
13

}
14

template<class _KeyT,_KeyT _min,_KeyT _max,class _RandIt,class _Func>
16

void statistic_sort(_RandIt _first,_RandIt _last,_Func _getKey)
17

{
18

size_t cnt_beg[_max - _min + 2], cnt_end[_max - _min + 1];
19

__init_index_array<_KeyT,_min,_max>(_first,_last,cnt_beg,cnt_end,_getKey);
21

__aux_index_sort<_KeyT,_min,_max>(_first,_last,cnt_beg,cnt_end,_getKey);
22

}

3）指針類型：包括基本整型和類類型指針，指針指向真正的元素，在這種情況下，其實現和類類型一樣，只是獲取關鍵字函數的參數類型為指針常量引用而已。
從以上全部實現可以看出，算法并沒有限制關鍵字的取值范圍，而是由_min和_max決定，也就是說所謂的特定小范圍依賴于實際應用，而這個范圍的上限，理論上則取決于數組空間可支持的最大大小，只不過當范圍相對元素個數較小時，具有良好的性能。

應用示例

//自定義類型
2

struct Item
3

{
4

int key;
5

int other;
6

int GetKey() const
8

{
9

return key;
10

}
11

};
12

//自由普通函數定義
13

inline int GetObjKey(Item const& item)
14

{
15

return item.key;
16

}
17

inline int GetPtrKey(Item* const& item)
18

{
19

return item->key;
20

}
21

//支持數組,元素類型為int
23

int p[100000];
24

//在編譯期指定關鍵字范圍
25

statistic_sort<int,-32768,32767>(p,p+sizeof(p)/sizeof(p[0]));
26

//在運行期指定關鍵字范圍
27

statistic_sort<int,65536>(p,p+sizeof(p)/sizeof(p[0]),-32768,32767);
28

//支持容器
30

//元素類型為Item*,成員函數獲取關鍵字,在運行期指定關鍵字范圍
31

vector<Item*> v1;
32

statistic_sort<int,65536>(v1.begin(),v1.end(),-32768,32767,std::mem_fun(&Item::GetKey));
33

//元素類型為Item,成員函數獲取關鍵字,在編譯期指定關鍵字范圍
35

vector<Item> v2;
36

statistic_sort<int,-32768,32767>(v2.begin(),v2.end(),std::mem_fun_ref(&Item::GetKey));
37

//自由普通函數獲取關鍵字
39

//元素類型為Item,在運行期指定關鍵字范圍
40

vector<Item> v3;
41

statistic_sort<int,65536>(v3.begin(),v3.end(),-32768,32767,GetObjKey);
42

//元素類型為Item*,在編譯期指定關鍵字范圍
43

vector<Item*> v4;
44

statistic_sort<int,-32768,32767>(v4.begin(),v4.end(),GetPtrKey);

   以上所有代碼在vc2008和gcc4.6下編譯通過，測試結果表明：
   ● 當元素類型為基本整型，且個數大于關鍵字取值范圍較多時，statistic_sort比stl::sort快很多倍，用數組存儲元素的比容器快。
   ● 當元素類型為類類型和指針類型時，性能還取決于獲取關鍵字方法的實現，當機器直接支持高效獲取關鍵字時（如位運算），則比快速排序要快，而用自由普通函數方式獲取關鍵字的比類成員函數要快。
   ● 所有編譯期版本比運行期要快。

posted on 2012-05-31 12:11 春秋十二月閱讀(1811) 評論(0) 編輯收藏引用所屬分類: Algorithm

只有注冊用戶登錄后才能發表評論。
【推薦】100%開源！大型工業跨平臺軟件C++源碼提供，建模，組態！

相關文章: 一個歐拉數整除問題的兩種證法有限域上的特征與指數和之擴展二元二次型的相似變換、正定性與正交分解關于群的一些結論及應用不定方程的代數數論解法關于橢圓曲線的驗證計算不可約多項式判別算法的改正論證有限域上平方根的求解求解離散對數問題的Terr算法簡單私鑰加密構造的驗證及安全性分析

網站導航: 博客園 IT新聞 BlogJava 博問 Chat2DB 管理

常用鏈接

留言簿(75)

隨筆分類(160)

隨筆檔案(161)

文章分類(30)

關注的開源項目

最新隨筆

積分與排名

最新評論

閱讀排行榜

評論排行榜