Ethereal-dev: Re: [Ethereal-dev] Problem displaying packets containing extended characters in

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Fri, 15 Aug 2003 18:07:15 -0700

On Tuesday, August 12, 2003, at 4:41 PM, spamcontrol2@xxxxxxxxxxx wrote:

I've noticed that starting with 0.9.14 I have problems with captures
occasionally. Packet summaries do not appear in the Info column, and in a
separate window I see the following errors:

(ethereal.exe:1328): Gdk-WARNING **: gdk_text_size: gdk_nmbstowchar_ts failed
(ethereal.exe:1328): Gdk-WARNING **: gdk_win32_draw_text: gdk_nmbstowchar_ts failed

One set of these messages seems to appear per packet that has a problem.

I've only noticed this in Windows (Win2k and WinXP, and on multiple systems), and do not have this problem with 0.9.13. I know that I don't have these
problems in Linux with 0.9.14, at least with GTK 1.2.10.

I've only noticed this with SMB packets, and I believe it's with all packets that contain strings with extended characters (in one case, 0x00e4) that would appear in the Info column. These same strings appear to display in the tree,
except that the suspected characters do not display.

GTK+ 1.2[.x]'s routines for displaying strings take, as I remember, strings in whatever the encoding is for the font being used - even if it's a 2-byte encoding. Ethereal tries to get only ISO 8859-1-encoded fonts, so that it doesn't have problems with 2-byte encodings; this means strings containing characters that aren't in ISO 8859-1 don't display well.

GTK+ 1.3[.x], at least on Windows, and GTK+ 2.x, on all platforms, take strings in UTF-8. This causes strings that aren't UTF-8-encoded not to display well - and SMB doesn't use UTF-8, it uses the native character set ("code page") or Unicode.
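
To illustrate the mismatch, here is a minimal Python sketch (not Ethereal code). It assumes a string containing U+00E4, the character mentioned in the report, and shows that its ISO 8859-1, PC code page, and 2-byte Unicode forms are all different byte sequences, none of which is the UTF-8 form that a UTF-8 GTK+ expects. The helper names `is_valid_utf8` and `to_utf8`, the sample string, and the choice of code page 437 are assumptions made for the example.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw can be decoded as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode raw from source_encoding to UTF-8 (hypothetical helper)."""
    return raw.decode(source_encoding).encode("utf-8")


# The same four-character string, "M\u00e4rz", as it might arrive on the wire.
as_latin1 = b"M\xe4rz"                       # ISO 8859-1: U+00E4 is the byte 0xE4
as_cp437 = b"M\x84rz"                        # PC code page 437: U+00E4 is the byte 0x84
as_ucs2 = "M\u00e4rz".encode("utf-16-le")    # 2-byte little-endian Unicode

# Neither single-byte form is valid UTF-8, so a UTF-8 toolkit cannot
# interpret it as given.
print(is_valid_utf8(as_latin1), is_valid_utf8(as_cp437))

# Converted with the correct source encoding, all three give the same UTF-8.
print(to_utf8(as_latin1, "iso-8859-1"))
print(to_utf8(as_cp437, "cp437"))
print(to_utf8(as_ucs2, "utf-16-le"))
```

The conversion only works if the source encoding is known, which for SMB means knowing whether a given string is in the negotiated code page or in Unicode.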

We should probably eventually have Ethereal use UTF-8, UCS-2, or UCS-4 internally. I think they're rich enough to handle all the 8859-n character sets, although CJK unification might cause problems when using various East Asian character sets; I don't know if they're rich enough to handle all the various weird PC code pages and pre-OS X Mac character sets, but hopefully they are. Ethereal should then attempt to do whatever is necessary to display correctly with GTK+ 1.2[.x] and the UTF-8 GTKs, and to do something appropriate for printing to printers and to text files. For text files on UNIX it should probably use the character set specified by the LANG or appropriate LC_ environment variable; I'm not sure what's appropriate on Windows - 2-byte Unicode text files, such as NetMon 2.x produces, or something else?
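
As a rough sketch of the UNIX text-file case, the following Python picks an output character set from the locale environment variables and encodes an internal Unicode string with it. It assumes the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG) and the `language[_territory][.codeset][@modifier]` form of locale names. The helper names `locale_codeset` and `encode_for_text_file`, the ASCII fallback, and the choice to replace unrepresentable characters with `?` are assumptions for illustration, not a proposal for what Ethereal actually does.

```python
def locale_codeset(env, default="ascii"):
    """Return the codeset named by the first non-empty locale variable.

    env is a mapping such as os.environ. Falls back to default when the
    chosen locale name carries no ".codeset" part (for example "C").
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            # Drop an optional "@modifier", then look for ".codeset".
            value = value.split("@", 1)[0]
            if "." in value:
                return value.split(".", 1)[1]
            return default
    return default


def encode_for_text_file(text, env):
    """Encode text for a text file; unrepresentable characters become '?'."""
    return text.encode(locale_codeset(env), errors="replace")


summary = "M\u00e4rz"
print(encode_for_text_file(summary, {"LANG": "de_DE.ISO-8859-1"}))
print(encode_for_text_file(summary, {"LANG": "en_US.UTF-8"}))
print(encode_for_text_file(summary, {"LANG": "C"}))
```

In a real program the mapping passed in would be the process environment, and a codeset name the runtime does not recognize would need its own fallback.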