Wireshark · Wireshark-dev: Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

From: Guy Harris <guy@xxxxxxxxxxxx>

Date: Tue, 28 Jun 2011 10:43:22 -0700

On Jun 28, 2011, at 10:27 AM, Guy Harris wrote:

> 	when putting them into a textual representation of the protocol tree or into columns or something else to be shown to humans, map them to UTF-8, with anything that can't be mapped to UTF-8 - including, if the encoding is putatively UTF-8, octet sequences that aren't valid UTF-8 sequences - shown as the Unicode replacement character U+FFFD;

...and, for "for display" conversions, we might want to convert control characters to "Control Pictures" symbols: U+0000 through U+001F convert to U+2400 through U+241F (␀, ␁, etc. through ␟), and U+007F converts to U+2421 (␡). In the font in which this message is being displayed to me, those have the control character abbreviations displayed in really, really small letters, diagonally from upper left to lower right. Unfortunately, I see nothing for C1 control characters.
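
A rough sketch of the "for display" conversion described above, in Python rather than Wireshark's C. The function name `for_display` and its parameters are made up for illustration and are not part of any Wireshark API. It assumes the decoder's "replace" error handling is an acceptable way to produce U+FFFD for octets that can't be mapped, and it leaves C1 control characters untouched, since there are no Control Pictures for them.

```python
def for_display(raw: bytes, encoding: str = "utf-8") -> str:
    """Hypothetical helper: turn raw octets into a string safe to show to humans."""
    # errors="replace" substitutes U+FFFD for anything that cannot be decoded,
    # including octet sequences that are not valid UTF-8.
    text = raw.decode(encoding, errors="replace")

    out = []
    for ch in text:
        cp = ord(ch)
        if cp <= 0x1F:
            # C0 controls U+0000..U+001F map to Control Pictures U+2400..U+241F.
            out.append(chr(0x2400 + cp))
        elif cp == 0x7F:
            # DELETE maps to U+2421 (SYMBOL FOR DELETE).
            out.append("\u2421")
        else:
            # Everything else, including C1 controls, is passed through unchanged.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # A NUL, a valid "é" in UTF-8, a stray 0xFF octet, and a DEL.
    print(for_display(b"a\x00\xc3\xa9\xff\x7f"))
```

Note that this maps tab, CR and LF to their pictures as well; a real implementation might want to leave some of those alone, depending on where the string is displayed.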

