|
Post by Cation on Apr 5, 2004 17:21:49 GMT -5
Does anyone know how to get text received from msn (I think it's UTF-8) to display correctly?
I'm using XP and VB 6.0.
Thanks
|
|
Tewl
xTeam
C# & VB Developer
Posts: 495
|
Post by Tewl on Apr 5, 2004 22:14:57 GMT -5
you have to convert the text to utf-8 which is not hard but you must use a control that has support for unicode such as the controls you get with a microsoft office license.
Private Declare Function MultiByteToWideChar Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As String, ByVal cchMultiByte As Long, ByVal lpWideCharStr As String, ByVal cchWideChar As Long) As Long Private Declare Function WideCharToMultiByte Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As String, ByVal cchWideChar As Long, ByVal lpMultiByteStr As String, ByVal cchMultiByte As Long, ByVal lpDefaultChar As String, ByVal lpUsedDefaultChar As Long) As Long
|
|
|
Post by Dangermouse on Apr 6, 2004 2:54:12 GMT -5
Want to atleast give a hint at what kind of parameters you have to pass there lol
|
|
|
Post by AbsoluteGenius on Apr 6, 2004 3:36:24 GMT -5
This might help
This code is taken from inside my Connection OCX, IRCDominator uses it to convert UTF-8 but, the controls used (User list and chat box) are not Unicode compliant so it dont look converted,
Straight from a complate bas module
' ----------------------------------------------------------------------------- ' InCodePage ' Lots of utility functions for deteremining which codepage and LCID ' to use for an arbitrary string, very helpful adjunct in multinational ' applications where you may not know the language. Option Compare Binary Option Explicit
Public Const CP_ACP = 0 Public Const CP_NONE = 0 Public Const CP_OEMCP = 1 Public Const CP_WINANSI = 1004 Public Const CP_WINUNICODE = 1200 Public Const MB_PRECOMPOSED = &H1 Public Const MB_COMPOSITE = &H2 Public Const MB_USEGLYPHCHARS = &H4
' //Arabic Public Const CP_AWIN = 101 ' //Bidi Windows codepage Public Const CP_709 = 102 ' //MS-DOS Arabic Support CP 709 Public Const CP_720 = 103 ' //MS-DOS Arabic Support CP 720 Public Const CP_A708 = 104 ' //ASMO 708 Public Const CP_A449 = 105 ' //ASMO 449+ Public Const CP_TARB = 106 ' //MS Transparent Arabic Public Const CP_NAE = 107 ' //Nafitha Enhanced Arabic Char Set Public Const CP_V4 = 108 ' //Nafitha v 4.0 Public Const CP_MA2 = 109 ' //Mussaed Al Arabi (MA/2) CP 786 Public Const CP_I864 = 110 ' //IBM Arabic Supplement CP 864 Public Const CP_A437 = 111 ' //Ansi 437 codepage Public Const CP_AMAC = 112 ' //Macintosh Cod Page
' //Hebrew Public Const CP_HWIN = 201 ' //Bidi Windows codepage Public Const CP_862I = 202 ' //IBM Hebrew Supplement CP 862 Public Const CP_7BIT = 203 ' //IBM Hebrew Supplement CP 862 Folded Public Const CP_ISO = 204 ' //ISO Hebrew 8859-8 Character Set Public Const CP_H437 = 205 ' //Ansi 437 codepage Public Const CP_HMAC = 206 ' //Macintosh Cod Page
' /************************************************************************* ' Code Pages ' *************************************************************************/ Public Const CP_OEM_437 = 437 Public Const CP_ARABICDOS = 708 Public Const CP_DOS720 = 720 Public Const CP_IBM850 = 850 Public Const CP_IBM852 = 852 Public Const CP_DOS862 = 862 Public Const CP_IBM866 = 866 Public Const CP_THAI = 874 Public Const CP_JAPAN = 932 Public Const CP_CHINA = 936 Public Const CP_KOREA = 949 Public Const CP_TAIWAN = 950 Public Const CP_EASTEUROPE = 1250 Public Const CP_RUSSIAN = 1251 Public Const CP_WESTEUROPE = 1252 Public Const CP_GREEK = 1253 Public Const CP_TURKISH = 1254 Public Const CP_HEBREW = 1255 Public Const CP_ARABIC = 1256 Public Const CP_BALTIC = 1257 Public Const CP_VIETNAMESE = 1258 Public Const CP_ASCII = 20127 Public Const CP_RUSSIANKOI8R = 20866 Public Const CP_RUSSIANKOI8U = 21866 Public Const CP_ISOLATIN1 = 28591 Public Const CP_ISOEASTEUROPE = 28592 Public Const CP_ISOTURKISH = 28593 Public Const CP_ISOBALTIC = 28594 Public Const CP_ISORUSSIAN = 28595 Public Const CP_ISOARABIC = 28596 Public Const CP_ISOGREEK = 28597 Public Const CP_ISOHEBREW = 28598 Public Const CP_ISOTURKISH2 = 28599 Public Const CP_ISOLATIN9 = 28605 Public Const CP_HEBREWLOG = 38598 Public Const CP_USER = 50000 Public Const CP_AUTOALL = 50001 Public Const CP_JAPANNHK = 50220 Public Const CP_JAPANESC = 50221 Public Const CP_JAPANSIO = 50222 Public Const CP_KOREAISO = 50225 Public Const CP_TAIWANISO = 50227 Public Const CP_CHINAISO = 50229 Public Const CP_AUTOJAPAN = 50932 Public Const CP_AUTOCHINA = 50936 Public Const CP_AUTOKOREA = 50949 Public Const CP_AUTOTAIWAN = 50950 Public Const CP_AUTORUSSIAN = 51251 Public Const CP_AUTOGREEK = 51253 Public Const CP_AUTOARABIC = 51256 Public Const CP_JAPANEUC = 51932 Public Const CP_CHINAEUC = 51936 Public Const CP_KOREAEUC = 51949 Public Const CP_TAIWANEUC = 51950 Public Const CP_CHINAHZ = 52936 Public Const CP_MAC_ROMAN = 10000 Public Const CP_MAC_JAPAN = 10001 Public Const CP_MAC_ARABIC = 10004 Public Const CP_MAC_GREEK = 10006 Public Const CP_MAC_CYRILLIC = 10007 Public Const CP_MAC_LATIN2 = 10029 Public Const CP_MAC_TURKISH = 10081 Public Const CP_DEFAULT = CP_ACP
Public Const CP_JOHAB = 1361 Public Const CP_SYMBOL = 42 Public Const CP_UTF8 = 65001 Public Const CP_UTF7 = 65000 Public Const CP_UNICODELITTLE = 1200 Public Const CP_UNICODEBIG = 1201 Public Const CP_UNKNOWN = -1
Public Const MB_ERR_INVALID_CHARS = &H8 ' /* error for invalid chars */
Public Const WC_DEFAULTCHECK = &H100 ' /* check for default char */ Public Const WC_COMPOSITECHECK = &H200 ' /* convert composite to precomposed */ Public Const WC_DISCARDNS = &H10 ' /* discard non-spacing chars */ Public Const WC_SEPCHARS = &H20 ' /* generate separate chars */ Public Const WC_DEFAULTCHAR = &H40 ' /* replace w/ default char */
Private Type FONTSIGNATURE fsUsb(4) As Long fsCsb(2) As Long End Type
Private Type CHARSETINFO ciCharset As Long ciACP As Long fs As FONTSIGNATURE End Type
Private Const LOCALE_IDEFAULTCODEPAGE = &HB Private Const LOCALE_IDEFAULTANSICODEPAGE = &H1004 Private Const TCI_SRCCODEPAGE = 2
Private Declare Function GetACP Lib "Kernel32" () As Long Private Declare Function GetLocaleInfoA Lib "Kernel32" (ByVal Locale As Long, ByVal LCType As Long, ByVal lpLCData As String, ByVal cchData As Long) As Long Private Declare Function GetSystemDefaultLCID Lib "Kernel32" () As Long Private Declare Function IsWindowUnicode Lib "user32" (ByVal hwnd As Long) As Long Private Declare Function TranslateCharsetInfo Lib "gdi32" (lpSrc As Long, lpcs As CHARSETINFO, ByVal dwFlags As Long) As Long
' The OS functions, if you prefer to use them Private Declare Function MultiByteToWideChar Lib "Kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long Private Declare Function WideCharToMultiByte Lib "Kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpDefaultChar As String, lpUsedDefaultChar As Long) As Long
' Now, pick which dll to use and comment out the other ones. ' Private Declare Function MsoCpgFromLid Lib "c:\program files\Microsoft Office 9\office\mso9.dll" Alias "#307" (ByVal lid As Long) As Long ' Private Declare Function MsoCpgFromLid Lib "c:\program files\Microsoft Office 8\office\mso97.dll" Alias "#307" (ByVal lid As Long) As Long Private Declare Function MsoCpgFromLid Lib "c:\program files\common files\microsoft shared\vba\mso97rt.dll" Alias "#307" (ByVal lid As Long) As Long ' Private Declare Function MsoMultiByteToWideChar Lib "c:\program files\Microsoft Office 9\office\mso9.dll" Alias "#778" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long ' Private Declare Function MsoMultiByteToWideChar Lib "c:\program files\Microsoft Office 8\office\mso97.dll" Alias "#778" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long Private Declare Function MsoMultiByteToWideChar Lib "c:\program files\common files\microsoft shared\vba\mso97rt.dll" Alias "#778" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long ' Private Declare Function MsoWideCharToMultiByte Lib "c:\program files\Microsoft Office 9\office\mso9.dll" Alias "#915" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpDefaultChar As Long, lpUsedDefaultChar As Long) As Long ' Private Declare Function MsoWideCharToMultiByte Lib "c:\program files\Microsoft Office 8\office\mso97.dll" Alias "#915" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpDefaultChar As Long, lpUsedDefaultChar As Long) As Long Private Declare Function MsoWideCharToMultiByte Lib "c:\program files\common files\microsoft shared\vba\mso97rt.dll" Alias "#915" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpDefaultChar As Long, lpUsedDefaultChar As Long) As Long
And of course all the formatting gets removed :S
Due to size restirctions I have had to split this post
|
|
|
Post by AbsoluteGenius on Apr 6, 2004 3:37:16 GMT -5
' -------------------------------- ' AToW ' ANSI to UNICODE conversion, via a given codepage. ' -------------------------------- Public Function AToW(ByVal st As String, Optional ByVal cpg As Long = -1, Optional lFlags As Long = 0) As String Dim stBuffer As String Dim cwch As Long Dim pwz As Long Dim pwzBuffer As Long If cpg = -1 Then cpg = GetACP() pwz = StrPtr(st) cwch = MultiByteToWideChar(cpg, lFlags, pwz, -1, 0&, 0&) ' stBuffer = String$(cwch + 1, vbNullChar) stBuffer = String$(cwch + 1, " ") pwzBuffer = StrPtr(stBuffer) cwch = MultiByteToWideChar(cpg, lFlags, pwz, -1, pwzBuffer, Len(stBuffer)) AToW = Left$(stBuffer, cwch - 1) End Function ' ---------------------------------------------------------------------------------------- ' AToWEx ' ANSI to UNICODE conversion, via a given an lcid. ' ---------------------------------------------------------------------------------------- Public Function AToWEx(ByVal st As String, Optional ByVal lcid As Long = -1, Optional lFlags As Long = 0) As String Dim cpg As Long Dim lpUsedDefaultChar As Long ' If no codepage is specified, use the default system codepage If lcid = -1 Then lcid = GetSystemDefaultLCID() cpg = ChsFromLocaleEx(lcid)
AToWEx = AToW(st, cpg, lFlags) End Function
' -------------------------------- ' WToA ' UNICODE to ANSI conversion, via a given codepage ' -------------------------------- Public Function WToA(ByVal st As String, Optional ByVal cpg As Long = -1, Optional lFlags As Long = 0) As String Dim stBuffer As String Dim cwch As Long Dim pwz As Long Dim pwzBuffer As Long Dim lpUsedDefaultChar As Long If cpg = -1 Then cpg = GetACP() pwz = StrPtr(st) cwch = WideCharToMultiByte(cpg, lFlags, pwz, -1, 0&, 0&, "?", ByVal 0&) stBuffer = String$(cwch + 1, " ") pwzBuffer = StrPtr(stBuffer) cwch = WideCharToMultiByte(cpg, lFlags, pwz, -1, pwzBuffer, Len(stBuffer), ByVal 0&, ByVal 0&) WToA = Left$(stBuffer, cwch - 1) End Function
' ---------------------------------------------------------------------------------------- ' WToAEx ' UNICODE to ANSI conversion, via a given an lcid. ' ---------------------------------------------------------------------------------------- Public Function WToAEx(ByVal st As String, Optional ByVal lcid As Long = -1, Optional lFlags As Long = 0) As String Dim cpg As Long Dim lpUsedDefaultChar As Long ' If no codepage is specified, use the default system codepage If lcid = -1 Then lcid = GetSystemDefaultLCID() cpg = ChsFromLocaleEx(lcid)
WToAEx = WToA(st, cpg, lFlags) End Function ' ---------------------------------------------------------------------------------------- ' FStringInCpg ' Tests whether a particular string fits within a given codepage, ' given the string and a codepage. ' ---------------------------------------------------------------------------------------- Public Function FStringInCpg(ByVal st As String, Optional ByVal cpg As Long = -1) As Boolean Dim cwch As Long Dim lpUsedDefaultChar As Long ' If no codepage is specified, use the default system codepage If cpg = -1 Then cpg = GetACP() ' We are not converting, simply determining if the system plans ' on using the default char at all (which it does when it cannot ' map a char in the string) cwch = MsoWideCharToMultiByte(cpg, 0&, StrPtr(st), -1, 0&, 0&, ByVal 0&, lpUsedDefaultChar) FStringInCpg = (CBool(lpUsedDefaultChar) = False) End Function
' ---------------------------------------------------------------------------------------- ' FStringInCpgEx ' Tests whether a particular string fits within a given codepage, ' given the string and an LCID. ' ---------------------------------------------------------------------------------------- Public Function FStringInCpgEx(ByVal st As String, Optional ByVal lcid As Long = -1) As Boolean Dim cwch As Long Dim cpg As Long Dim lpUsedDefaultChar As Long ' If no codepage is specified, use the default system codepage If lcid = -1 Then lcid = GetSystemDefaultLCID() cpg = ChsFromLocaleEx(lcid) FStringInCpgEx = FStringInCpg(st, cpg) End Function
' ---------------------------------------------------------------------------------------- ' ChsFromLocale ' The OS version ' ---------------------------------------------------------------------------------------- Public Function ChsFromLocale(lcid As Long) As Long Dim cwc As Long Dim cpg As Long Dim stBuffer As String Dim cs As CHARSETINFO stBuffer = String$(10, vbNullChar) cwc = GetLocaleInfoA(lcid, LOCALE_IDEFAULTANSICODEPAGE, _ stBuffer, Len(stBuffer))
If cwc > 0 Then cpg = Val(Left$(stBuffer, cwc - 1)) If TranslateCharsetInfo(ByVal cpg, cs, _ TCI_SRCCODEPAGE) Then ChsFromLocale = cs.ciCharset End If End If End Function
' ---------------------------------------------------------------------------------------- ' ChsFromLocaleEx ' The MSO version, much better at this sort of thing ' ---------------------------------------------------------------------------------------- Public Function ChsFromLocaleEx(lcid As Long) As Long ChsFromLocaleEx = MsoCpgFromLid(lcid) End Function Public Function EncodeUTF8(ByVal sDecodeString As String) As String If sDecodeString = vbNullString Then Exit Function EncodeUTF8 = StrConv(WToA(sDecodeString, CP_UTF8, 0), vbUnicode) End Function Public Function DecodeUTF8(ByVal sDecodeString As String) As String Dim sTemp As String
If sDecodeString = vbNullString Then Exit Function sTemp = WToA(sDecodeString, CP_ACP) DecodeUTF8 = AToW(sTemp, CP_UTF8) End Function
|
|
|
Post by Alexander on Apr 6, 2004 6:54:59 GMT -5
whoa big amount of code there
|
|
Tewl
xTeam
C# & VB Developer
Posts: 495
|
Post by Tewl on Apr 6, 2004 7:24:55 GMT -5
good i didnt want to write an example anyway lol
|
|
|
Post by Cation on Apr 6, 2004 10:16:49 GMT -5
Thanks Tewl and AbsoluteGenius. I've been using MultiByteToWideChar and WideCharToMultiByte with good success. I asked the question because I was curious if there was anything I was missing. I'll study AG's code to see if there is anything in there that solves a problem I have with specfic combinations of utf-8 bytes. For anyone else that wants to experiment with the wonderful world of multi byte character sets (MBCS), I'll post some links to some unicode aware controls. MS Forms 2.0 has a nice collection of unicode enabled controls but you can't legally distribute them with your application. However, you can have your users install setuppad.exe which will legally install the controls on their computer. msdn.microsoft.com/library/default.asp?url=/library/en-us/dnaxctrl/html/cpad.asp For richedit controls that support unicode, I've had good luck with the controls on these sites br] www.mvps.org/emorcillo/vb6/controls/index.shtmlwww.vbaccelerator.com/home/VB/Code/Controls/index.aspFor a good replacement for the listview control, I use vbaccelerator's sGrid2 www.vbaccelerator.com/home/VB/Code/Controls/index.aspAll the controls discussed above are free Of course, there are also many 3rd party unicode controls that are not free. If anyone has a question about any of those, I might have tried them. Thanks again! Cation
|
|
|
Post by anthrax on Apr 6, 2004 11:12:56 GMT -5
^ some good info thanks
|
|
SudsyUncle®
New Member
Just because I'm paranoid, DOES NOT mean that noone is out to get me!
Posts: 58
|
Post by SudsyUncle® on Apr 17, 2004 0:18:26 GMT -5
This is a fairly decent module i found( i think it was on old board) for nicks, could be used as conversion for full text. Been too long ago to remember to whom the credit goes, sry Function eNick(ByVal Nick As String) As String Dim chr1, chr2, chr3, chr4, chr5, strlength Nick = Replace(Nick, "%20", " ") strlength = Len(Nick)
For z = 1 To strlength chr1 = Asc(Mid(Nick, z, 1)) If chr1 >= 0 And chr1 <= 127 Then decodedNick = decodedNick & ChrW(chr1) ElseIf chr1 >= 192 And chr1 <= 223 Then chr2 = Asc(Mid(Nick, z + 1, 1)) decodedNick = decodedNick & ChrW((chr1 - 192) * 64 + (chr2 - 128)) z = z + 1 ElseIf chr1 >= 224 And chr1 <= 229 Then chr2 = Asc(Mid(Nick, z + 1, 1)) chr3 = Asc(Mid(Nick, z + 2, 1)) decodedNick = decodedNick & ChrW((chr1 - 224) * 4096 + (chr2 - 128) * 64 + (chr3 - 128)) z = z + 2 ElseIf chr1 >= 240 And chr1 <= 247 Then chr2 = Asc(Mid(Nick, z + 1, 1)) chr3 = Asc(Mid(Nick, z + 2, 1)) chr4 = Asc(Mid(Nick, z + 3, 1)) decodedNick = decodedNick & ChrW((chr1 - 240) * 262144 + (chr2 - 128) * 4096 + (chr3 - 128) * 64 + (chr4 - 128)) z = z + 3 ElseIf chr1 >= 248 And chr1 <= 251 Then chr2 = Asc(Mid(Nick, z + 1, 1)) chr3 = Asc(Mid(Nick, z + 2, 1)) chr4 = Asc(Mid(Nick, z + 3, 1)) chr5 = Asc(Mid(Nick, z + 4, 1)) decodedNick = decodedNick & ChrW((chr1 - 248) * 16777216 + (chr2 - 128) * 262144 + (chr3 - 128) * 4096 + (chr4 - 128) * 64 + (chr5 - 128)) z = z + 4 ElseIf chr1 >= 252 And chr1 <= 253 Then chr2 = Asc(Mid(Nick, z + 1, 1)) chr3 = Asc(Mid(Nick, z + 2, 1)) chr4 = Asc(Mid(Nick, z + 3, 1)) chr5 = Asc(Mid(Nick, z + 4, 1)) chr6 = Asc(Mid(Nick, z + 5, 1)) decodedNick = decodedNick & ChrW((chr1 - 252) * 1073741824 + (chr2 - 128) * 16777216 + (chr3 - 128) * 262144 + (chr4 - 128) * 4096 + (chr5 - 128) * 64 + (chr6 - 128)) z = z + 5 Else decodedNick = decodedNick & ChrW(chr1) End If Next eNick = decodedNick End Function
|
|
|
Post by Webagent on Apr 17, 2004 0:37:41 GMT -5
|
|