
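            I saw that a senior member here had posted code for converting between UTF-8 and Unicode, so I'd like to add a note in passing; I hope it is of some help.
            Below are the bit widths of some encoding forms.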

            Examples of fixed-width encoding forms:

            Type                        Each character encoded as   Notes
            7-bit                       a single 7-bit quantity     example: ISO 646
            8-bit G0/G1                 a single 8-bit quantity     with constraints on use of C0 and C1 spaces
            8-bit                       a single 8-bit quantity     with no constraints on use of C1 space
            8-bit EBCDIC                a single 8-bit quantity     with the EBCDIC conventions rather than ASCII conventions
            16-bit (UCS-2)              a single 16-bit quantity    within a code space of 0..FFFF
            32-bit (UCS-4)              a single 32-bit quantity    within a code space of 0..7FFFFFFF
            32-bit (UTF-32)             a single 32-bit quantity    within a code space of 0..10FFFF
            16-bit DBCS process code    a single 16-bit quantity    example: UNIX widechar implementations of Asian CCS's
            32-bit DBCS process code    a single 32-bit quantity    example: UNIX widechar implementations of Asian CCS's
            DBCS Host                   two 8-bit quantities        following IBM host conventions

            Examples of variable-width encoding forms:

            Name      Characters are encoded as                       Notes
            UTF-8     a mix of one to four 8-bit code units in        used only with Unicode/10646
                      Unicode, and one to six code units in 10646
            UTF-16    a mix of one to two 16-bit code units           used only with Unicode/10646
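
            To make the "one to four 8-bit code units" row concrete, here is a minimal sketch of how a single code point maps to UTF-8 under the Unicode limit of 0..10FFFF. The function name encode_utf8 is hypothetical and is not part of Boost or the standard library; the older five- and six-unit forms from 10646 are deliberately left out.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not a Boost or standard API): encode one Unicode
// code point as UTF-8, using one to four 8-bit code units.
// Returns an empty string for surrogates (D800..DFFF) and for values
// above 10FFFF, which are not valid Unicode scalar values.
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp <= 0x7F) {
        // 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp <= 0x7FF) {
        // 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {
        if (cp >= 0xD800 && cp <= 0xDFFF) return out;  // surrogate range
        // 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

            For example, encode_utf8(0x41) yields one byte, while encode_utf8(0x20AC) (the euro sign) yields the three bytes E2 82 AC.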

            Boost provides a UTF-8 Codecvt Facet that converts between UTF-8 and UCS-4 (32-bit Unicode).
            It is used as follows:

              //...
              // My encoding type
              typedef wchar_t ucs4_t;

              std::locale old_locale;
              std::locale utf8_locale(old_locale,new utf8_codecvt_facet<ucs4_t>);

              // Set a New global locale
              std::locale::global(utf8_locale);

              // Write the UCS-4 data out, converting it to UTF-8
              {
                std::wofstream ofs("data.ucd");
                ofs.imbue(utf8_locale);
                std::copy(ucs4_data.begin(),ucs4_data.end(),
                      std::ostream_iterator<ucs4_t,ucs4_t>(ofs));
              }

              // Read the UTF-8 data back in, converting it to UCS-4
              std::vector<ucs4_t> from_file;
              {
                std::wifstream ifs("data.ucd");
                ifs.imbue(utf8_locale);
                ucs4_t item = 0;
                // get() keeps whitespace; operator>> would skip it by default
                while (ifs.get(item)) from_file.push_back(item);
              }
              //...
            For details on the UTF-8 Codecvt Facet, see
            http://www.boost.org/libs/serialization/doc/codecvt.html
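
            For readers who want to see what the UTF-8 to UCS-4 direction involves, here is a minimal, Boost-free sketch of a decoder. It is not Boost's implementation: the name decode_utf8 is hypothetical, and replacing malformed input with U+FFFD is an assumption of this sketch rather than documented facet behaviour.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not the Boost facet): decode a UTF-8 byte string
// into UCS-4 code points. Malformed, overlong, surrogate, or out-of-range
// sequences are replaced by U+FFFD (an assumption of this sketch).
std::vector<char32_t> decode_utf8(const std::string& in) {
    // Smallest code point that legitimately needs 1, 2, 3, or 4 bytes.
    static const char32_t min_cp[] = {0x0, 0x80, 0x800, 0x10000};
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }  // invalid lead byte
        ++i;
        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            // Each continuation byte must look like 10xxxxxx.
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        if (!ok || cp < min_cp[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF)) {
            cp = 0xFFFD;
        }
        out.push_back(cp);
    }
    return out;
}
```

            The facet performs this kind of conversion inside the stream, so application code only ever sees wide characters; a standalone function like the one above is mainly useful for understanding or testing the byte layout.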

            posted on 2006-02-15 17:19 張沈鵬 Views (2662) Comments (2)
            Comments
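            • # re: Boost: UTF-8 Codecvt Facet (converting between Unicode and UTF-8)
              無名高手
              Posted @ 2006-02-24 14:18
              Ignorant youngster, since you seem eager to learn, let me give you a pointer.

              Unicode Technical Report #17, Character Encoding Model
              http://www.unicode.org/unicode/reports/tr17

              As for the higher-level (simpler) trick, heh, I'm not telling you~~
            • # re: Boost: UTF-8 Codecvt Facet (converting between Unicode and UTF-8)
              張沈鵬
              Posted @ 2006-02-25 10:01
              So what is the higher-level (simpler) trick? I would appreciate your guidance, thanks.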
             