Ethereal-dev: Re: [Ethereal-dev] Problem displaying packets containing extended characters in

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Fri, 15 Aug 2003 18:07:15 -0700
On Tuesday, August 12, 2003, at 4:41 PM, spamcontrol2@xxxxxxxxxxx wrote:

I've noticed that starting with 0.9.14 I have problems with captures
occasionally. Packet summaries do not appear in the Info column, and in a
separate window I see the following errors:

(ethereal.exe:1328): Gdk-WARNING **: gdk_text_size: gdk_nmbstowchar_ts failed
(ethereal.exe:1328): Gdk-WARNING **: gdk_win32_draw_text: gdk_nmbstowchar_ts failed

One set of these messages seems to appear for each packet that has a problem.

I've only noticed this on Windows (Win2k and WinXP, on multiple
systems), and I do not have this problem with 0.9.13. I know that I
don't have these problems on Linux with 0.9.14, at least with GTK 1.2.10.

I've only noticed this with SMB packets, and I believe it affects all
packets that contain strings with extended characters (in one case,
0x00e4) that would appear in the Info column. These same strings appear
to display in the tree, except that the suspected characters do not
display.

GTK+ 1.2[.x]'s routines for displaying strings take, as I remember, 
strings in whatever encoding the font being used has - even if 
it's a 2-byte encoding.  Ethereal tries to get only ISO 8859-1-encoded 
fonts, so that it doesn't have problems with 2-byte encodings; this 
means strings containing characters outside ISO 8859-1 don't 
display well.

GTK+ 1.3[.x], at least on Windows, and GTK+ 2.x, on all platforms, take 
strings in UTF-8.  This means strings that aren't UTF-8-encoded don't 
display well - and SMB doesn't use UTF-8, it uses the native character 
set ("code page") or Unicode.
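
To make the mismatch concrete, here is a minimal sketch in Python (not Ethereal code) of converting an on-the-wire string to UTF-8 before handing it to a UTF-8 toolkit. The function names, the `is_unicode` flag, and the choice of `cp850` as the default code page are all assumptions for illustration; "Unicode" is taken to mean 2-byte little-endian.

```python
def wire_to_utf8(raw: bytes, is_unicode: bool, codepage: str = "cp850") -> bytes:
    """Convert a string as it appears on the wire to UTF-8 bytes.

    Hypothetical helper: is_unicode selects 2-byte little-endian Unicode,
    otherwise the bytes are assumed to be in the given code page.
    """
    source_encoding = "utf-16-le" if is_unicode else codepage
    # Undecodable input becomes U+FFFD rather than raising an error.
    text = raw.decode(source_encoding, errors="replace")
    return text.encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is well-formed UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

With this sketch, the character 0x00e4 from the report (`b"\xe4\x00"` on the wire as 2-byte Unicode) becomes the two UTF-8 bytes `b"\xc3\xa4"`, whereas the single ISO 8859-1 byte `b"\xe4"` embedded in a string is not valid UTF-8, which is the kind of input a UTF-8 toolkit would reject.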

We should probably eventually have Ethereal use UTF-8, UCS-2, or UCS-4 
internally.  Hopefully they're rich enough to handle all the various 
weird PC code pages and pre-OS X Mac character sets, although I don't 
know that they are.  I think they're rich enough to handle all the 
8859-n character sets, although CJK unification might cause problems 
when using various East Asian character sets.

Ethereal should then attempt to do whatever is necessary to display 
correctly with GTK+ 1.2[.x] and the UTF-8 GTKs, and to do something 
appropriate for printing to printers and to text files.  For text 
files on UNIX it should probably use the character set specified by 
the LANG or appropriate LC_ environment variable; I'm not sure what's 
appropriate on Windows - 2-byte Unicode text files, such as NetMon 2.x 
produces, or something else?
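
As a rough illustration of that idea, the following Python sketch keeps strings as Unicode internally and picks an output encoding per destination. The target names (`"gtk1"`, `"gtk2"`, `"textfile"`) and the function itself are hypothetical, and replacing unrepresentable characters with `?` is just one possible policy.

```python
import locale


def encode_for_target(text: str, target: str) -> bytes:
    """Encode an internal Unicode string for a given output target.

    Hypothetical targets:
      "gtk1"     - GTK+ 1.2[.x] with ISO 8859-1 fonts
      "gtk2"     - a toolkit that takes UTF-8
      "textfile" - a text file in the locale's character set
    """
    if target == "gtk1":
        encoding = "iso-8859-1"
    elif target == "gtk2":
        encoding = "utf-8"
    elif target == "textfile":
        # On UNIX this is derived from LANG / LC_* settings.
        encoding = locale.getpreferredencoding(False)
    else:
        raise ValueError("unknown target: " + target)
    # Characters the target encoding cannot represent become "?".
    return text.encode(encoding, errors="replace")
```

For example, `encode_for_target("ä", "gtk1")` gives the single byte `b"\xe4"` and `encode_for_target("ä", "gtk2")` gives `b"\xc3\xa4"`, while a character outside ISO 8859-1, such as the euro sign, degrades to `?` for the `"gtk1"` target.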