清風竹林

Masters at Play with C++

toupper, tolower
Everyone on the planet knows that C++'s string has no toupper. Luckily this is not a big problem, because we have the STL algorithms:

string s("heLLo");
transform(s.begin(), s.end(), s.begin(), ::toupper);
cout << s << endl;
transform(s.begin(), s.end(), s.begin(), ::tolower);
cout << s << endl;

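Of course, I know what many people really want is s.to_upper(), but for something as general-purpose as basic_string there really is no way to build in such specialized methods. If you use the Boost string algorithms (boost stringalgo), this is of course trivial, and you have no need to read this article.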
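One caveat with passing ::toupper straight to transform is that it takes an int, and a negative char value is outside what it is specified to handle. A minimal sketch of wrapper functions that sidestep this follows; the names to_upper_copy and to_lower_copy are hypothetical, and the lambdas assume a C++11 compiler.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns an upper-cased copy of s.
// Going through unsigned char keeps negative char values away from toupper.
std::string to_upper_copy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
        [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return s;
}

// Hypothetical helper: returns a lower-cased copy of s.
std::string to_lower_copy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}
```

Both functions take their argument by value, so the caller's string is left untouched.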

------------------------------------------------------------------------
trim
We also know that string has no trim, but doing it ourselves is not hard either; it is even simpler than toupper:

    string s("   hello   ");
    s.erase(0, s.find_first_not_of(" \n"));
    cout << s << endl;
    s.erase(s.find_last_not_of(' ') + 1);
    cout << s << endl;

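Note that find_first_not_of and find_last_not_of can both take a string, in which case they look for the first (or last) character that is not any of the characters in that string, so you can trim several kinds of characters in one go.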
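The two erase calls can be folded into one small function. This is only a sketch: trim_copy and its default character set are assumptions made for illustration, not something string provides.

```cpp
#include <string>

// Hypothetical helper: strips every character listed in `chars`
// from both ends of s and returns the result.
std::string trim_copy(const std::string& s,
                      const std::string& chars = " \t\r\n")
{
    const std::string::size_type first = s.find_first_not_of(chars);
    if (first == std::string::npos)
        return std::string();            // s holds nothing but trimmable characters
    const std::string::size_type last = s.find_last_not_of(chars);
    return s.substr(first, last - first + 1);
}
```

Passing a different second argument, for example "-", trims that set of characters instead of whitespace.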

-----------------------------------------------------------------------
erase
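string's own erase is pretty good, but it can only erase a contiguous run of characters. What if you want to remove every occurrence of some character from a string? STL's erase + remove_if does the job. Note that remove_if on its own is not enough: it only shifts the kept characters to the front and returns the new logical end, without shrinking the string.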

    string s("   hello, world. say bye   ");
    // bind2nd and equal_to<char> come from <functional>; bind2nd was removed in C++17
    s.erase(remove_if(s.begin(), s.end(),
        bind2nd(equal_to<char>(), ' ')),
    s.end());

The snippet above removes all the spaces, leaving hello,world.saybye.
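When the test is simply "equal to one character", plain std::remove does the same work without any binder. A small sketch, with remove_char as a hypothetical name:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: returns s with every occurrence of ch removed.
// std::remove compacts the kept characters; erase then trims the leftover tail.
std::string remove_char(std::string s, char ch)
{
    s.erase(std::remove(s.begin(), s.end(), ch), s.end());
    return s;
}
```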

-----------------------------------------------------------------------
replace
string itself provides replace, but it works on positions rather than on substrings. The operation we use most, swapping one substr for another, takes a little composition:

    string s("hello, world");
    string sub("ello, ");
    s.replace(s.find(sub), sub.size(), "appy ");
    cout << s << endl;

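The output is happy world. Note that the original substr and the replacement do not have to be the same length. The snippet assumes sub actually occurs in s; if find returns npos, replace throws std::out_of_range.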
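The same find + replace combination extends naturally to replacing every occurrence. The function below is a sketch under that assumption; replace_all is a hypothetical name, not a member of string.

```cpp
#include <string>

// Hypothetical helper: replaces every occurrence of `from` in s with `to`.
std::string replace_all(std::string s,
                        const std::string& from,
                        const std::string& to)
{
    if (from.empty())
        return s;                         // an empty pattern would never advance
    std::string::size_type pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos)
    {
        s.replace(pos, from.size(), to);
        pos += to.size();                 // continue after the inserted text
    }
    return s;
}
```

Advancing pos past the inserted text keeps the loop from re-matching inside a replacement that itself contains the pattern.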

-----------------------------------------------------------------------
startwith, endwith
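These two really are used all the time, but if you look closely at string's interface you will find there is no real need to provide them as dedicated methods; the existing interface handles them nicely: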

    string s("hello, world");
    string head("hello");
    string tail("ld");
    bool startwith = s.compare(0, head.size(), head) == 0;
    cout << boolalpha << startwith << endl;
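    // the size check keeps compare from throwing when tail is longer than s
    bool endwith = s.size() >= tail.size() && s.compare(s.size() - tail.size(), tail.size(), tail) == 0;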
    cout << boolalpha << endwith << endl;

Of course, it is not as convenient as s.startwith("hello").
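If the convenience matters, the two compare calls wrap up into free functions easily. The names below are hypothetical; as far as I know, C++20 eventually added starts_with and ends_with members to std::string, so this sketch is mainly of interest for older compilers.

```cpp
#include <string>

// Hypothetical helper: true if s begins with prefix.
bool starts_with(const std::string& s, const std::string& prefix)
{
    return s.size() >= prefix.size()
        && s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: true if s ends with suffix.
// The size check comes first so that s.size() - suffix.size() cannot wrap around.
bool ends_with(const std::string& s, const std::string& suffix)
{
    return s.size() >= suffix.size()
        && s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```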

------------------------------------------------------------------------
toint, todouble, tobool...
This one is old hat too. Both the C way and the C++ way work, and each has its own character:

    string s("123");
    int i = atoi(s.c_str());
    cout << i << endl;
   
    int ii;
    stringstream(s) >> ii;
    cout << ii << endl;
   
    string sd("12.3");
    double d = atof(sd.c_str());
    cout << d << endl;
   
    double dd;
    stringstream(sd) >> dd;
    cout << dd << endl;
   
    string sb("true");
    bool b;
    stringstream(sb) >> boolalpha >> b;
    cout << boolalpha << b << endl;

The C way is very concise, with assignment and conversion done in a single statement, while the C++ way is very general.
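Neither variant above reports a failed conversion. Because the stream approach is generic, one template can cover int, double and bool and also say whether the text was well formed. This is a sketch; parse is a hypothetical name and the "no trailing junk" rule is a design choice made here, not something the article prescribes.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: converts text to T via a string stream.
// Returns false if the conversion fails or non-whitespace text is left over.
template <typename T>
bool parse(const std::string& text, T& out)
{
    std::istringstream iss(text);
    if (!(iss >> std::boolalpha >> out))
        return false;                     // nothing convertible at the front
    char extra;
    return !(iss >> extra);               // leftover characters mean malformed input
}
```

With this, parse("123", i) succeeds while parse("12abc", i) is rejected, which atoi would silently accept as 12.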

------------------------------------------------------------------------
split
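This one is a real chore. The interface we would most like to have is s.split(vect, ','). Doing it with STL algorithms is somewhat hard, so let's start simple: if the delimiters are spaces, tabs, newlines and the like, this is enough: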

    string s("hello world, bye.");
    istringstream iss(s);   // a named stream: istream_iterator cannot bind to a temporary
    vector<string> vect;
    vect.assign(
        istream_iterator<string>(iss),
        istream_iterator<string>()
    );

Be aware, though, that if s is large there is an efficiency concern, because stringstream copies the string for its own use.

------------------------------------------------------------------------
concat
How do you concatenate all the strings held in a container of string? I hope your answer is not a hand-coded loop. Isn't this nicer?

    vector<string> vect;
    vect.push_back("hello");
    vect.push_back(", ");
    vect.push_back("world");
   
    cout << accumulate(vect.begin(), vect.end(), string(""));

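There is some room for optimization in terms of efficiency, though, since accumulate builds the result through repeated operator+.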
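One way to use that room is to size the result once and then append. The sketch below does resort to loops, which the article teases, so treat it as a trade: less elegance for a single allocation. concat is a hypothetical name and the range-for loops assume C++11.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: joins all strings in parts into one string,
// reserving the final size up front so the result grows only once.
std::string concat(const std::vector<std::string>& parts)
{
    std::string::size_type total = 0;
    for (const std::string& p : parts)
        total += p.size();

    std::string result;
    result.reserve(total);
    for (const std::string& p : parts)
        result += p;                      // append in place, no temporaries
    return result;
}
```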

-------------------------------------------------------------------------

reverse
I actually rather doubt that anyone really needs to reverse a string, but doing so is certainly easy:

  std::reverse(s.begin(), s.end());

That reverses in place. If you need the reversed result in a different string, it is just as simple:

  s1.assign(s.rbegin(), s.rend());

The efficiency is quite good as well.

-------------------------------------------------------------------------

Parsing a file extension
The wordier way to write it:

    std::string filename("hello.exe");

    std::string::size_type pos = filename.rfind('.');
    std::string ext = filename.substr(pos == std::string::npos ? filename.length() : pos + 1);

Only two lines, but what about squeezing them into one? That can be done too:

    std::string ext = filename.substr(filename.rfind('.') == std::string::npos ? filename.length() : filename.rfind('.') + 1);

I know, rfind runs twice. But first, you can hope the compiler optimizes it away, and second, extensions are usually short, so even if it does run once more the difference should be tiny.
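One thing neither version handles is a dot that belongs to a directory name, as in dir.v2/readme. The sketch below wraps the same rfind logic in a function and adds that check; file_extension is a hypothetical name, and treating both '/' and '\\' as path separators is an assumption made for illustration.

```cpp
#include <string>

// Hypothetical helper: returns the text after the last '.', or an empty
// string when there is no dot or the last dot sits in a directory name.
std::string file_extension(const std::string& path)
{
    const std::string::size_type dot = path.rfind('.');
    if (dot == std::string::npos)
        return std::string();             // no dot at all
    const std::string::size_type sep = path.find_last_of("/\\");
    if (sep != std::string::npos && sep > dot)
        return std::string();             // the dot is part of a directory name
    return path.substr(dot + 1);
}
```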
STL algorithms
distance
Very often we want to know at which position a value sits in a vector, a list, or some other container. find alone does not give us an index, so some people fall back on a hand-written loop, and a primitive one at that:

for ( int i = 0; i < vect.size(); ++i)
    if ( vect[i] == value ) break;

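If the compiler treats i as part of the for scope, you also have to move the declaration of i outside the loop. Is that really necessary? Take a look at this: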

    int dist =
        distance(col.begin(),
            find(col.begin(), col.end(), 5));

Here col can be any of many containers: list, vector, deque... Of course, this is for the case where you are sure 5 is in col. If you are not sure, add a check:

    int dist = -1;   // stays -1 when 5 is not found
    list<int>::iterator pos = find(col.begin(), col.end(), 5);
    if ( pos != col.end() )
        dist = distance(col.begin(), pos);

I think that is still nicer than a hand-written loop.
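The find + distance pair can be packaged once and reused for any container. This is a sketch: index_of is a hypothetical name, and returning -1 for "not found" is a convention chosen here.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <list>   // only needed for the usage shown below

// Hypothetical helper: position of the first element equal to value,
// or -1 if the container does not hold it.
template <typename Container, typename T>
std::ptrdiff_t index_of(const Container& c, const T& value)
{
    typename Container::const_iterator pos =
        std::find(c.begin(), c.end(), value);
    if (pos == c.end())
        return -1;
    return std::distance(c.begin(), pos);
}

// Usage, assuming col is a std::list<int>:
//     std::ptrdiff_t where = index_of(col, 5);
```

For a list, distance walks the nodes, so the call is linear twice over; it is still a single line at the call site.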

--------------------------------------------------------------------------
max, min
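These have direct algorithm support. The complexity is O(n), of course, and they are meant for unsorted containers. If the container is sorted... well, buddy, do you still need an algorithm?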

max_element(col.begin(), col.end());
min_element(col.begin(), col.end());

Note that what comes back is an iterator. If the value is all you care about, fine:

*max_element(col.begin(), col.end());
*min_element(col.begin(), col.end());

max_element and min_element both compare with less by default, and both also accept a binary predicate. If you are bored enough, you can even use max_element as min_element, or the other way round:

*max_element(col.begin(), col.end(), greater<int>()); // returns the minimum!
*min_element(col.begin(), col.end(), greater<int>()); // returns the maximum

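Of course, that is not what they are meant for. The point is to let you use them in more specialized situations, for example when what you want to compare is some member of each element, or the return value of a member function. For example: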

#include <iostream>
#include <list>
#include <algorithm>
#include <string>
#include <boost/bind.hpp>

using namespace boost;
using namespace std;

struct Person
{
    Person(const string& _name, int _age)
        : age(_age), name(_name)
    {}
    int age;
    string name;
};

int main()
{
    list<Person> col;
    list<Person>::iterator pos;

    col.push_back(Person("Tom", 10));
    col.push_back(Person("Jerry", 12));
    col.push_back(Person("Mickey", 9));

    // boost::bind is spelled out so it cannot be confused with std::bind
    Person eldest =
        *max_element(col.begin(), col.end(),
            boost::bind(&Person::age, _1) < boost::bind(&Person::age, _2)); // needs Boost >= 1.33
   
    cout << eldest.name;
}

The output is Jerry. This uses boost.bind; forgive me for not knowing how to write it with bind2nd and mem_fun, and I do not want to know either...
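For readers without Boost, an ordinary comparison function works as the predicate too. The sketch below mirrors the Person example; younger and eldest are hypothetical names, and eldest assumes the list is not empty.

```cpp
#include <algorithm>
#include <list>
#include <string>

struct Person
{
    Person(const std::string& n, int a) : name(n), age(a) {}
    std::string name;
    int age;
};

// Orders two people by age; this plays the role of the boost::bind expression.
bool younger(const Person& a, const Person& b)
{
    return a.age < b.age;
}

// Hypothetical helper. Precondition: people is not empty.
const Person& eldest(const std::list<Person>& people)
{
    return *std::max_element(people.begin(), people.end(), younger);
}
```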

-------------------------------------------------------------------------
copy_if
That's right, the STL simply has no copy_if (std::copy_if only arrived with C++11), which is why we need this:

template <typename InputIterator, typename OutputIterator, typename Predicate>
OutputIterator copy_if(
    InputIterator begin, InputIterator end, OutputIterator destBegin, Predicate p)
{
    while (begin != end)
    {
        if (p(*begin))*destBegin++ = *begin;
        ++begin;
    }
    return destBegin;
}

Keeping it in your own toolbox is a wise choice.
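A typical use filters one container into another through back_inserter. The sketch below calls std::copy_if, so it assumes a C++11 or later standard library; with an older compiler the hand-written template above can be substituted. is_even and evens are hypothetical names.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

bool is_even(int n)
{
    return n % 2 == 0;
}

// Hypothetical helper: copies only the even values of in into a new vector.
std::vector<int> evens(const std::vector<int>& in)
{
    std::vector<int> out;
    std::copy_if(in.begin(), in.end(), std::back_inserter(out), is_even);
    return out;
}
```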

------------------------------------------------------------------------
Idiom: erase(iter++)
If you want to remove certain elements from a list, be very careful (the code below is WRONG!!!):

#include <iostream>
#include <list>
#include <algorithm>
#include <iterator>

int main()
{
    int arr[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    std::list<int> lst(arr, arr + 10);

    for ( std::list<int>::iterator iter = lst.begin();
          iter != lst.end(); ++iter)
        if ( *iter % 2 == 0 )
            lst.erase(iter);   // WRONG: iter is invalid once erased
           
    std::copy(lst.begin(), lst.end(),
        std::ostream_iterator<int>(std::cout, " "));
}

Once iter has been erased it is already invalid, yet ++iter is still executed afterwards, so the behavior is undefined! If you do not want to bring in remove_if, the usual way out is:

#include <iostream>
#include <list>
#include <algorithm>
#include <iterator>

int main()
{
    int arr[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    std::list<int> lst(arr, arr + 10);

    for ( std::list<int>::iterator iter = lst.begin();
          iter != lst.end(); )
        if ( *iter % 2 == 0 )
            lst.erase(iter++);   // iter moves on before the old node is erased
        else
            ++iter;
          
    std::copy(lst.begin(), lst.end(),
        std::ostream_iterator<int>(std::cout, " "));
}

But the code above cannot be used with vector, string, or deque, because for those containers erase invalidates not only iter but also every iterator after iter!
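For the sequence containers, erase also returns an iterator to the element that follows the erased one, and assigning that back to the loop variable works for vector, deque, string and list alike. A minimal sketch with a hypothetical erase_evens function:

```cpp
#include <vector>

// Hypothetical helper: removes the even values from v in place.
// erase returns the iterator after the erased element, so `it`
// is always valid at the top of the loop.
void erase_evens(std::vector<int>& v)
{
    for (std::vector<int>::iterator it = v.begin(); it != v.end(); )
    {
        if (*it % 2 == 0)
            it = v.erase(it);
        else
            ++it;
    }
}
```

On a vector each erase shifts the tail, so this loop can cost quadratic time on large inputs; the erase(remove...) idiom avoids that.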

-------------------------------------------------------------------------
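The erase(remove...) idiom
The loop above is so hard to write, so non-generic, and so hard to understand that STL algorithms are the better way to go. But note that remove_if alone is useless; you must use the erase(remove...) idiom: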

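#include <iostream>
#include <list>
#include <algorithm>
#include <iterator>
#include <functional>
#include <boost/bind.hpp>

int main()
{
    int arr[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    std::list<int> lst(arr, arr + 10);

    // remove_if moves the odd values to the front; erase drops the leftover tail
    lst.erase(std::remove_if(lst.begin(), lst.end(),
        boost::bind(std::modulus<int>(), _1, 2) == 0),
        lst.end()
    );
          
    std::copy(lst.begin(), lst.end(),
        std::ostream_iterator<int>(std::cout, " "));
}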

Of course, this leans on boost.bind so that we do not have to write yet another throwaway functor.
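Without Boost, a plain predicate function is enough, and list additionally has its own remove_if member that unlinks nodes directly. The sketch below shows both; is_even and drop_evens are hypothetical names.

```cpp
#include <algorithm>
#include <list>
#include <vector>

bool is_even(int n)
{
    return n % 2 == 0;
}

// vector: the classic erase(remove_if(...)) pair.
void drop_evens(std::vector<int>& v)
{
    v.erase(std::remove_if(v.begin(), v.end(), is_even), v.end());
}

// list: the member function removes the matching nodes itself.
void drop_evens(std::list<int>& l)
{
    l.remove_if(is_even);
}
```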

Simple common sense: about streams
Reading one line from a file

Simple, this will do:

ifstream ifs("input.txt");
char buf[1000];

ifs.getline(buf, sizeof buf);

string input(buf);

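Sure, nothing is wrong with this, but it involves needless fuss and copying. What's more, if a line is longer than 1000 characters you need a loop and even more tedious buffer management. Isn't the following simpler?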

string input;
input.reserve(1000);
ifstream ifs("input.txt");
getline(ifs, input);

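Not only is it simple, it is also safe, because the global function getline takes care of chores such as running out of buffer space for you. If you do not want memory allocation to happen too often, just reserve a bit more space.

That is what "simple common sense" means: many things are already there, and I just never used them.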
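The same call in a loop collects every line of a stream. A minimal sketch; read_lines is a hypothetical name, and it takes any std::istream so it works for an ifstream as well as for an in-memory stream.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Hypothetical helper: reads in line by line until getline fails.
std::vector<std::string> read_lines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line))        // the stream itself is the loop condition
        lines.push_back(line);
    return lines;
}
```

Testing the result of getline directly, rather than eof(), means a failed read never gets stored as a line.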

---------------------------------------------------------------------------

Reading a whole file into a string in one go

I hope your answer is not this:

string input;
while( !ifs.eof() )
{
    string line;
    getline(ifs, line);
    input.append(line).append(1, '\n');
}

Sure, it is not wrong and it works (apart from appending one extra '\n' at the end), but isn't the approach below more in the spirit of C++?

string input(
    istreambuf_iterator<char>(ifs.rdbuf()),
    istreambuf_iterator<char>()
);

Likewise, allocating space up front may bring a performance benefit:

string input;
input.reserve(10000);
input.assign(
    istreambuf_iterator<char>(ifs.rdbuf()),
    istreambuf_iterator<char>()
);

Simple, isn't it? Yet these are facts we often overlook.
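Wrapped in a function, the iterator pair reads any input stream to its end. A sketch, with read_all as a hypothetical name:

```cpp
#include <istream>
#include <iterator>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// Hypothetical helper: returns everything left in the stream, whitespace included.
std::string read_all(std::istream& in)
{
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Because istreambuf_iterator reads the buffer directly, spaces and newlines come through unchanged.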
One addition: doing it the following way is problematic:

    string input;
    input.assign(
        istream_iterator<char>(ifs),
        istream_iterator<char>()
    );

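Because it skips all whitespace, you end up with a string of nothing but non-whitespace characters. Finally, if all you want is to send a file's contents to another stream, this is hard to beat: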

    fstream fs("temp.txt");
    cout << fs.rdbuf();

So if you want to copy a file by hand, this is the best way (short of using the operating system's API):

   ifstream ifs("in.txt");
   ofstream ofs("out.txt");
   ofs << ifs.rdbuf();
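The rdbuf trick is not tied to files; it moves data between any pair of streams. The sketch below uses a hypothetical copy_stream function. One thing to keep in mind: if the source yields no characters at all, operator<< sets failbit on the destination.

```cpp
#include <istream>
#include <ostream>
#include <sstream>   // std::stringstream types, handy for trying this without files

// Hypothetical helper: pushes everything left in `in` into `out`.
void copy_stream(std::istream& in, std::ostream& out)
{
    out << in.rdbuf();
}
```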

-------------------------------------------------------------------------

The options for opening a file

ios::in     Open file for reading
ios::out    Open file for writing
ios::ate    Initial position: end of file
ios::app    Every output is appended at the end of file
ios::trunc  If the file already existed it is erased
ios::binary Binary mode

-------------------------------------------------------------------------

And the ios flags

    flag 	effect if set
    ios_base::boolalpha 	input/output bool objects as alphabetic names (true, false).
    ios_base::dec 	input/output integer in decimal base format.
    ios_base::fixed 	output floating point values in fixed-point notation.
    ios_base::hex 	input/output integer in hexadecimal base format.
    ios_base::internal 	the output is filled at an internal point enlarging the output up to the field width.
    ios_base::left 	the output is filled at the end enlarging the output up to the field width.
    ios_base::oct 	input/output integer in octal base format.
    ios_base::right 	the output is filled at the beginning enlarging the output up to the field width.
    ios_base::scientific 	output floating-point values in scientific notation.
    ios_base::showbase 	output integer values preceded by the numeric base.
    ios_base::showpoint 	output floating-point values including always the decimal point.
    ios_base::showpos 	output non-negative numeric preceded by a plus sign (+).
    ios_base::skipws 	skip leading whitespaces on certain input operations.
    ios_base::unitbuf 	flush output after each inserting operation.
    ios_base::uppercase 	output uppercase letters replacing certain lowercase letters.

There are also defined three other constants that can be used as masks:

    constant 	value
    ios_base::adjustfield 	left | right | internal
    ios_base::basefield 	dec | oct | hex
    ios_base::floatfield 	scientific | fixed
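The masks matter because dec, oct and hex are mutually exclusive: the two-argument form of setf clears the whole field before setting the new flag. A small sketch with a hypothetical to_hex function:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: formats value in hexadecimal with a 0x prefix.
std::string to_hex(int value)
{
    std::ostringstream os;
    os.setf(std::ios_base::hex, std::ios_base::basefield);  // replace dec with hex
    os.setf(std::ios_base::showbase);                       // add the base prefix
    os << value;
    return os.str();
}
```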

--------------------------------------------------------------------------

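Parsing a string with the delimiter I want, and reading data from a stream

This used to be a topic that took quite a bit of trouble, and because it comes up so often it felt all the more troublesome. In fact getline does a decent job:

    string s;
    while ( getline(cin, s, ';') && s != "quit" )   // also stops at end of input
        cout << s << endl;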

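Simple, right? Note, though, that since getline now treats only ; as the delimiter, you have to end the input with ;quit; or else getline will read the surrounding spaces and newlines into s. Of course, that can be handled in code.

Likewise, for simple string parsing we hardly need to bring in a Tokenizer or anything of the sort:

#include <iostream>
#include <sstream>
#include <string>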

using namespace std;

int main()
{
    string s("hello,world, this is a sentence; and a word, end.");
    stringstream ss(s);
   
    for ( ; ; )
    {
        string token;
        getline(ss, token, ',');
        if ( ss.fail() ) break;
       
        cout << token << endl;
    }
}

Output:

hello
world
 this is a sentence; and a word
 end.

Pretty, isn't it? The drawback of doing it this way is that only a single character can serve as the delimiter.
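When several delimiter characters are needed, string's own find_first_of can take over from getline. This is a sketch of one way to do it; split_any is a hypothetical name, and unlike the getline loop it keeps empty tokens, including a trailing one.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits s at every character that appears in delims.
std::vector<std::string> split_any(const std::string& s,
                                   const std::string& delims)
{
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    for (;;)
    {
        const std::string::size_type pos = s.find_first_of(delims, start);
        if (pos == std::string::npos)
        {
            tokens.push_back(s.substr(start));   // the last token
            break;
        }
        tokens.push_back(s.substr(start, pos - start));
        start = pos + 1;                         // step over the delimiter
    }
    return tokens;
}
```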

--------------------------------------------------------------------------

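Sending what used to go to the screen into a file, without changing cout to fs everywhere
#include <iostream>
#include <fstream>
using namespace std;
int main()
{    
    ofstream outf("out.txt"); 
    streambuf *strm_buf = cout.rdbuf();   // remember cout's original buffer
    cout.rdbuf(outf.rdbuf());             // from here on cout writes to out.txt
    cout << "write something to file" << endl;
    cout.rdbuf(strm_buf);                 // assumed ending: the source listing breaks off after the previous line
}

--------------------------------------------------------------------------

#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>

using namespace std;

int main()
{  
    vector<int> vect;
    for ( int i = 1; i <= 9; ++i )
        vect.push_back(i);
       
    copy(vect.begin(), vect.end(),
        ostream_iterator<int>(cout, " ")
    );
    cout << endl;
   
    ostream_iterator<double> os_iter(cout, " ~ ");
    *os_iter = 1.0;
    os_iter++;
    *os_iter = 2.0;
    *os_iter = 3.0;
}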

Output:

1 2 3 4 5 6 7 8 9
1 ~ 2 ~ 3 ~

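Clearly, the job of ostream_iterator is to allow iterator operations on a stream, so that algorithms can be applied to streams, and that is the essence of the STL. Combined with the "reading a file" trick from earlier, we get the most convenient way to display a file:

    copy(istreambuf_iterator<char>(ifs.rdbuf()),
         istreambuf_iterator<char>(),
         ostreambuf_iterator<char>(cout)
    );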

Likewise, if you use the statement below, what you get is output with all the whitespace separators gone:

    copy(istream_iterator<char>(ifs),
         istream_iterator<char>(),
         ostream_iterator<char>(cout)
    );

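That is most likely not the result you want. What if you insist on istream_iterator rather than istreambuf_iterator? There is still a way:

    copy(istream_iterator<char>(ifs >> noskipws),
         istream_iterator<char>(),
         ostream_iterator<char>(cout)
    );

But this is not the recommended approach; it is considerably less efficient than the first one.
If a file temp.txt has the contents below, then my way of reading the data from the file into a vector should impress you.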

12345 234 567
89    10

Program:

#include <iostream>
#include <fstream>
#include <vector>
#include <algorithm>
#include <iterator>

using namespace std;

int main()
{  
    ifstream ifs("temp.txt");
   
    vector<int> vect;
    vect.assign(istream_iterator<int>(ifs),
        istream_iterator<int>()
    );

    copy(vect.begin(), vect.end(), ostream_iterator<int>(cout, " "));
}

Output:

12345 234 567 89 10

Cool, isn't it? The drudgery of detecting the end of the file and moving the file pointer is all handled by istream_iterator.
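The same pair of iterators can initialize the vector directly. A sketch with a hypothetical read_ints function that accepts any input stream:

```cpp
#include <istream>
#include <iterator>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <vector>

// Hypothetical helper: reads whitespace-separated integers until
// the stream ends or something that is not an integer shows up.
std::vector<int> read_ints(std::istream& in)
{
    return std::vector<int>(std::istream_iterator<int>(in),
                            std::istream_iterator<int>());
}
```

Reading stops quietly at the first token that is not a number, so malformed input yields a shorter vector rather than an error.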

-----------------------------------------------------------------------

Other algorithms working with iterators

Counting the lines in a file:

    int line_count =
        count(istreambuf_iterator<char>(ifs.rdbuf()),
              istreambuf_iterator<char>(),
              '\n');       

Strictly speaking, of course, this counts the newline characters in the file. By the same token you can count any character in a file, or the occurrences of some token:

    int token_count =
        count(istream_iterator<string>(ifs),
              istream_iterator<string>(),
              "#include");       

Note that the code above counts "#include" only where it stands as a token of its own; if it is joined to other characters, it does not count.
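Both counts can be tried on an in-memory stream. The function names in this sketch are hypothetical; the second one shows the whole-token behavior just described.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// Hypothetical helper: number of '\n' characters left in the stream.
std::ptrdiff_t count_newlines(std::istream& in)
{
    return std::count(std::istreambuf_iterator<char>(in),
                      std::istreambuf_iterator<char>(), '\n');
}

// Hypothetical helper: number of whitespace-separated tokens equal to token.
std::ptrdiff_t count_token(std::istream& in, const std::string& token)
{
    return std::count(std::istream_iterator<std::string>(in),
                      std::istream_iterator<std::string>(), token);
}
```

Given the text "#include <a>\n#include<b>\n", count_token finds "#include" once, because the second occurrence is glued to "<b>".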

------------------------------------------------------------------------
Manipulator

What is a manipulator? Simply put, it is a function that takes a stream as its argument and returns a stream. Take the noskipws used above; it is defined like this:

  inline ios_base&
  noskipws(ios_base& __base)
  {
    __base.unsetf(ios_base::skipws);
    return __base;
  }

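It uses the more general ios_base here. Knowing this, you will probably not be intimidated by writing a manipulator of your own. The silly manipulator below ignores all input on the stream up to the first semicolon (including that semicolon):

template <typename charT, typename traits>
inline std::basic_istream<charT, traits>&
ignoreToSemicolon (std::basic_istream<charT, traits>& s)
{
    s.ignore(std::numeric_limits<std::streamsize>::max(), s.widen(';'));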
    return s;
}

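Note, however, that it does not ignore later semicolons, because ignore runs only once. More generally, a manipulator can also take a parameter. The one below is the general version of ignoreToSemicolon: it takes one argument, and the stream ignores all input up to the first occurrence of that character. It is slightly more work to write: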

struct IgnoreTo {
    char ignoreTo;
    IgnoreTo(char c) : ignoreTo(c)
    {}
};
   
std::istream& operator >> (std::istream& s, const IgnoreTo& manip)
{
    s.ignore(std::numeric_limits<std::streamsize>::max(), s.widen(manip.ignoreTo));
    return s;
}

But the usage is much the same:

    copy(istream_iterator<char>(ifs >> noskipws >> IgnoreTo(';')),
         istream_iterator<char>(),
         ostream_iterator<char>(cout)
    );

The effect is the same as ignoreToSemicolon.
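The parameterless manipulator slots into an extraction chain just like the standard ones. The sketch below repeats its definition so the snippet compiles on its own and adds a hypothetical word_after_semicolon function as a usage example.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>

// Same manipulator as in the article, repeated here for completeness.
template <typename charT, typename traits>
std::basic_istream<charT, traits>&
ignoreToSemicolon(std::basic_istream<charT, traits>& s)
{
    s.ignore(std::numeric_limits<std::streamsize>::max(), s.widen(';'));
    return s;
}

// Hypothetical helper: returns the first word after the first ';' in text,
// or an empty string when there is no semicolon.
std::string word_after_semicolon(const std::string& text)
{
    std::istringstream in(text);
    std::string word;
    in >> ignoreToSemicolon >> word;   // used exactly like std::ws or std::noskipws
    return word;
}
```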
Posted on 2009-05-15 15:42 by 李現民. Category: 絕對盜版