Wireshark · Wireshark-dev: Re: [Wireshark-dev] non-ASCII stuff in manuf

From: Guy Harris <guy@xxxxxxxxxxxx>

Date: Mon, 25 Oct 2010 16:19:42 -0700

On Oct 25, 2010, at 3:13 PM, Jeff Morriss wrote:

> The manuf entry for that is (escaped):
> 
> 00:50:C2:0A:20:00/36    J\xC3\xAF\xC2\xBF\xC2\xBDgerCom
> 
> And, actually, Wireshark displays it pretty much like Unidecode() does:
> 
> Ji?1/2gerC

And so does Safari - if I do

	http://www.google.com/search?client=safari&rls=en&q=00:50:C2:0A:20:00&ie=UTF-8&oe=UTF-8

it finds

	http://hwaddress.com/?q=J%20&%20F%20Labs

which speaks of

	Jï¿½ger Computergesteuerte Messtechnik GmbH

and, not to any great surprise of mine, that's supposed to be

	http://www.adwin.de/index-us.html

"Jäger Computergesteuerte Messtechnik".

Now, "a with diaeresis" shouldn't take 6 octets to encode it in UTF-8 - that's 0x00E4 in Unicode, which turns into 0xC3 0xA4 in UTF-8.  Even if you use "a" plus a combining diaeresis, it's not 0xC3 0xAF 0xC2 0xBF 0xC2 0xBD in UTF-8 - that UTF-8 sequence is 0x00EF 0x00BF 0x00BD in Unicode, which is, not surprisingly, ï ¿ ½.  (And it's not ISO 8859/1, either - that'd just be 0xE4 for "a with diaeresis".)  I've no idea what character encoding would turn "a with diaeresis", or "a" plus a combining diaeresis, into that octet sequence.
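
For anyone who wants to check the octet arithmetic above, here is a minimal Python sketch (not part of the original message; the variable names are illustrative). It encodes "a with diaeresis" three ways and decodes the six octets from the manuf entry. The last step is an observation the message does not make, offered only as one possible reading: if the three resulting code points are treated as ISO 8859-1 octets, they form 0xEF 0xBF 0xBD, which is the UTF-8 encoding of U+FFFD REPLACEMENT CHARACTER.

```python
# "a with diaeresis" in the encodings discussed above
precomposed = "\u00e4".encode("utf-8")       # b'\xc3\xa4'
combining = "a\u0308".encode("utf-8")        # b'a\xcc\x88' ("a" + combining diaeresis)
latin1 = "\u00e4".encode("iso-8859-1")       # b'\xe4'

# The six octets that actually appear in the manuf entry
manuf_octets = b"\xc3\xaf\xc2\xbf\xc2\xbd"
decoded = manuf_octets.decode("utf-8")
code_points = [f"U+{ord(ch):04X}" for ch in decoded]   # U+00EF, U+00BF, U+00BD

# Assumption (not from the message): reinterpret those three code points
# as ISO 8859-1 octets and decode the result as UTF-8 once more.
inner = decoded.encode("iso-8859-1")         # b'\xef\xbf\xbd'
inner_decoded = inner.decode("utf-8")        # '\ufffd', the replacement character

print(precomposed, combining, latin1)
print(code_points)
print(inner, f"U+{ord(inner_decoded):04X}")
```

If that reading is right, the original character was already lost before the entry reached the manuf file, so no decoding step can recover it.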

I suspect that's just something corrupted in the entry in the IEEE database; perhaps we need to override that entry when we generate our manuf file.
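
A minimal sketch of what such an override might look like, assuming a manuf line is an address prefix followed by whitespace and a name. OVERRIDES, apply_override, and looks_corrupted are hypothetical names, and the corrected short name "JägerCom" is a guess based on the vendor name above; none of this is taken from the IEEE database or from Wireshark's actual manuf-generation script.

```python
# Hypothetical override table: address prefix -> corrected name.
OVERRIDES = {
    "00:50:C2:0A:20:00/36": "J\u00e4gerCom",   # assumed correction
}

# The three characters the corrupted entry decodes to (U+00EF U+00BF U+00BD).
CORRUPTION_MARKER = "\u00ef\u00bf\u00bd"


def looks_corrupted(line: str) -> bool:
    """Flag lines containing the corrupted sequence seen in this entry."""
    return CORRUPTION_MARKER in line


def apply_override(line: str) -> str:
    """Replace the name on a line whose prefix has an override; else keep it."""
    parts = line.split(None, 1)
    if len(parts) == 2 and parts[0] in OVERRIDES:
        return f"{parts[0]}\t{OVERRIDES[parts[0]]}"
    return line


bad = "00:50:C2:0A:20:00/36    J\u00ef\u00bf\u00bdgerCom"
print(looks_corrupted(bad))
print(apply_override(bad))
```

A check like looks_corrupted could be used to report other damaged entries, while the table handles the ones that have been corrected by hand.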
