C: Is there a way to dynamically send text as UTF-8?
Accepted answer by: aftcast
xiaolaba
Member · Posts: 10 · Replies: 17 · Points: 5 · Registered: 2010-05-15
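I wrote a program whose purpose is to let the recipient receive and see Chinese or Korean text.

My questions:
1) How do I convert a string to UTF-8?
2) How do I concatenate an ANSI string with the string converted to UTF-8?

Background: I tested with a web tool, http://www.endmemo.com/unicode/unicodeconverter.php. For example, entering [你好] gives the UTF-8 encoding of those two characters. When I hard-code those bytes in the source, the recipient does receive them and sees them displayed correctly.

Build environment: Visual Studio 2012, Character Set = "Use Multi-Byte Character Set".
UTF-8 encoding of [你好] = [E4 BD A0 E5 A5 BD]

With the following code the recipient receives and correctly sees the two characters [你好]:

[code cpp]
char UTF_8_how_r_u_hard_coded[7] = {0xe4, 0xbd, 0xa0, 0xe5, 0xa5, 0xbd, 0x00}; // UTF-8 encoded 你好 plus terminating NUL
wchar_t UTF_8_how_r_u_string[] = L"你好\0";
char string_buffer[256] = "";   // must start empty before strcat_s appends to it

strcat_s(string_buffer, "hello Sir ");
strcat_s(string_buffer, UTF_8_how_r_u_hard_coded);
//strcat_s(string_buffer, UTF_8_how_r_u_string);   // compile error: wchar_t array passed where char* is expected

// "send username1" must be an ANSI string; string_buffer must be an ANSI string
SMS_send("send username1", string_buffer);
[/code]

But if I change it as shown below, it does not work. The wchar_t line still fails to compile, and with the plain "你好" literal the chars that are sent do not contain the byte sequence 0xe4, 0xbd, 0xa0, 0xe5, 0xa5, 0xbd, 0x00, so the recipient does not see the correct text. Switching the Visual Studio 2012 project between "Use Multi-Byte Character Set" and "Use Unicode Character Set" makes no difference.

[code cpp]
char UTF_8_how_r_u_hard_coded[7] = {0xe4, 0xbd, 0xa0, 0xe5, 0xa5, 0xbd, 0x00}; // UTF-8 encoded 你好 plus terminating NUL
wchar_t UTF_8_how_r_u_string[] = L"你好\0";
char string_buffer[256] = "";   // must start empty before strcat_s appends to it

strcat_s(string_buffer, "hello Sir ");
strcat_s(string_buffer, UTF_8_how_r_u_hard_coded);
//strcat_s(string_buffer, UTF_8_how_r_u_string);   // compile error
strcat_s(string_buffer, "你好\0");                  // the recipient does not see these two characters

// "send username1" must be a NUL-terminated ANSI string; string_buffer must be a NUL-terminated ANSI string
SMS_send("send username1", string_buffer);
[/code]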
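A side note for readers: the six bytes quoted above follow directly from the UTF-8 encoding rules, so they can be checked by hand. The sketch below is not from the original post. `encode_utf8` and `check_ni_hao` are hypothetical helper names, and the code assumes the input is a valid code point (it does not reject surrogates or values above U+10FFFF).

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-8 bytes.
// Assumption: cp is a valid scalar value; no validation is done here.
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Check the bytes quoted in the post: 你 is U+4F60, 好 is U+597D.
void check_ni_hao() {
    std::string s = encode_utf8(0x4F60) + encode_utf8(0x597D);
    assert(s == "\xE4\xBD\xA0\xE5\xA5\xBD");
}
```

Both characters fall in the three-byte range, which is why two characters become six bytes.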
------
http://xiaolaba.wordpress.com
Edit history: xiaolaba edited this post on 2013-04-08 15:15:17; note: none.
aftcast
Deputy site administrator · Posts: 81 · Replies: 1485 · Points: 1763 · Registered: 2002-11-21
Please refer to my example.

[code cpp]
/* Xiao Chong's example, 2013-04-09 */
/* Fragment: place inside a function; needs <windows.h> and <string.h> */
const wchar_t *pwszHello = L"你好";
wchar_t *pwszUnicode = NULL;
char *pszUTF8 = NULL;
char *pszUTF8_from_ansi = NULL;
int iLen = 0;   // for pszUTF8
int iLen2 = 0;  // for pszUTF8_from_ansi

// UTF-16 -> UTF-8
// The first call returns the required size in bytes, including the terminating NUL.
iLen = WideCharToMultiByte(CP_UTF8, 0, pwszHello, -1, NULL, 0, NULL, NULL);
pszUTF8 = new char[iLen];
WideCharToMultiByte(CP_UTF8, 0, pwszHello, -1, pszUTF8, iLen, NULL, NULL);

// Purpose: ANSI -> UTF-16 -> UTF-8
// ANSI -> UTF-16
const char *pszAnsi = ", I'm Xiao Chong, 蕭沖";
// NOTE: if pszAnsi is not encoded in CP_ACP, pass the correct code page instead.
// For example, if pszAnsi contains Simplified Chinese, use 936.
int i = MultiByteToWideChar(CP_ACP, 0, pszAnsi, -1, NULL, 0);
pwszUnicode = new wchar_t[i];
MultiByteToWideChar(CP_ACP, 0, pszAnsi, -1, pwszUnicode, i);

// UTF-16 -> UTF-8
iLen2 = WideCharToMultiByte(CP_UTF8, 0, pwszUnicode, -1, NULL, 0, NULL, NULL);
pszUTF8_from_ansi = new char[iLen2];
WideCharToMultiByte(CP_UTF8, 0, pwszUnicode, -1, pszUTF8_from_ansi, iLen2, NULL, NULL);
//---------- Finished ANSI -> UTF-8 ----------

// Concatenate the two UTF-8 strings
int iTotal = iLen + iLen2 - 1;  // -1 because both lengths include a terminating NUL
char *utf8_result = new char[iTotal];
memset(utf8_result, 0, iTotal);
strcat_s(utf8_result, iTotal, pszUTF8);
strcat_s(utf8_result, iTotal, pszUTF8_from_ansi);

// Test of the result: C++ Builder 2009 onward only.
// Please remove the test below if you are NOT using C++ Builder.
#if __CODEGEARC__ > 0x0593
UTF8String u8 = utf8_result;
UnicodeString u16 = u8;  // converted automatically by the RTL
ShowMessage(u16);
#endif
// end test

delete [] pszUTF8;
delete [] pwszUnicode;
delete [] pszUTF8_from_ansi;
delete [] utf8_result;
[/code]
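The concatenation step works at the byte level because pure ASCII text is already valid UTF-8, so only the non-ASCII parts need converting. Below is a minimal portable sketch of that idea, assuming `std::string` is acceptable in place of manually sized buffers. `concat_utf8` and `build_message` are hypothetical names, not part of the example above, and the UTF-8 bytes are hard-coded here rather than produced by `WideCharToMultiByte`.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Concatenate two NUL-terminated UTF-8 byte strings.
// std::string tracks the length itself, so no manual size arithmetic is needed.
std::string concat_utf8(const char* a, const char* b) {
    std::string out;
    out.reserve(std::strlen(a) + std::strlen(b));
    out += a;
    out += b;
    return out;
}

// Build the message from the question: an ASCII prefix followed by UTF-8 bytes.
std::string build_message() {
    const char prefix[] = "hello Sir ";                 // pure ASCII, already valid UTF-8
    const char ni_hao[] = "\xE4\xBD\xA0\xE5\xA5\xBD";   // UTF-8 bytes for the two characters

    std::string msg = concat_utf8(prefix, ni_hao);

    // sizeof counts each terminating NUL, like the byte counts that
    // WideCharToMultiByte returns when the input length is -1.
    // Two NULs in, one NUL out: hence the "- 1" in the buffer size.
    assert(msg.size() + 1 == sizeof(prefix) + sizeof(ni_hao) - 1);
    return msg;
}
```

If the sending function takes a `const char*`, as the asker's `SMS_send` appears to (its signature is an assumption here), the result can be passed as `build_message().c_str()`.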
------
Xiao Chong (蕭沖) --All ideas are worthless unless implemented-- C++ Builder Delphi Taiwan G+ community http://bit.ly/cbtaiwan
xiaolaba
Member · Posts: 10 · Replies: 17 · Points: 5 · Registered: 2010-05-15
Thank you again.
That solved it.
------
http://xiaolaba.wordpress.com