Wireshark · Ethereal-dev: Re: [Ethereal-dev] UCP protocols descissor (sms)

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Guy Harris <guy@xxxxxxxxxxxx>

Date: Mon, 15 Dec 2003 01:32:30 -0800

On Thu, Dec 11, 2003 at 11:57:38AM +0300, Taras wrote:
> Is it possible to add support for IRA encoded strings

By "IRA" do you mean the ISO 646 character set?

If so, then would that require a run-time configuration option to
specify which national variant of ISO 646 is being used?

> to ./epan/ftypes/ftype-string.c file?

It's probably not the right thing to do.  ftype-string.c is for the
FT_STRING types, and there shouldn't be different string types for
different character sets.  For one thing, there are fields where the
character set can't be determined at compile time, just as there are
fields where the byte order can't be determined at compile time.  For
another thing, there are a *lot* of character sets that Ethereal should,
eventually, be capable of handling, and it'd probably be best if the
strings handled in epan/ftypes/ftype-string.c were in some standardized
character set and encoding.

In the short term, at least for those national variants that map to ISO
8859/1, the way to handle ISO 646 would be to map it to ISO 8859/1 and
use the mapped string as the value to use in "proto_tree_add_string()".
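
As a rough illustration of that short-term mapping, here is a minimal
sketch in C.  It assumes the German national variant of ISO 646
(DIN 66003), chosen only as an example; the message does not say which
variant UCP uses.  The helper names are hypothetical and are not part of
Ethereal.  The translated buffer is what a dissector would then hand to
"proto_tree_add_string()" as the value.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not an Ethereal API: translate one octet of the
 * German ISO 646 variant (DIN 66003, assumed here for illustration) to
 * the corresponding ISO 8859/1 code.
 */
static unsigned char
iso646_de_to_latin1(unsigned char c)
{
    c &= 0x7F;                  /* ISO 646 is a 7-bit code */
    switch (c) {
    case 0x40: return 0xA7;     /* section sign replaces '@' */
    case 0x5B: return 0xC4;     /* A with diaeresis replaces '[' */
    case 0x5C: return 0xD6;     /* O with diaeresis replaces '\' */
    case 0x5D: return 0xDC;     /* U with diaeresis replaces ']' */
    case 0x7B: return 0xE4;     /* a with diaeresis replaces '{' */
    case 0x7C: return 0xF6;     /* o with diaeresis replaces '|' */
    case 0x7D: return 0xFC;     /* u with diaeresis replaces '}' */
    case 0x7E: return 0xDF;     /* sharp s replaces '~' */
    default:   return c;        /* all other positions match ASCII */
    }
}

/*
 * Translate len octets from src into dst as a NUL-terminated
 * ISO 8859/1 string.  dst must have room for len + 1 bytes.
 */
static void
iso646_de_string_to_latin1(const unsigned char *src, size_t len, char *dst)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < len; i++)
        dst[i] = (char)iso646_de_to_latin1(src[i]);
    dst[len] = '\0';
}
```

A table per national variant, selected at run time, would follow the
same shape; only the eight or so variant positions differ.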

That's not the ideal answer outside of the Americas and Western Europe
if there are ISO 646 variants that require characters not in 8859/1.  If
that's the case, in the medium term, it might be useful to, at least
temporarily, have some option for Ethereal to specify which 8859/x
variant is being used.

That's not the ideal answer outside of locales using only single-byte
character sets, however.  In the long term, Ethereal should probably
have the string types have values in some ISO 10646 encoding (UTF-8,
etc.), and, for cases where the packet data isn't UTF-8, have the
dissector translate from the character set in the packet (some Windows
code page, some Mac character set, some EUC character set, some other
"extended ASCII" character set, EBCDIC, UTF-8-encoded ISO 10646, 16-bit
Unicode, etc.) into that 10646 encoding.
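
To make the long-term idea concrete, here is a minimal sketch in C of
one such translation: ISO 8859/1 packet data into UTF-8.  It covers
only that one source character set, since every 8859/1 octet is the
ISO 10646 code point of the same value; other character sets would need
their own tables.  The function name is hypothetical and is not part of
Ethereal.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not an Ethereal API: convert len octets of
 * ISO 8859/1 text in src to NUL-terminated UTF-8 in dst.
 * dst must have room for 2 * len + 1 bytes, because each octet at or
 * above 0x80 becomes a two-byte UTF-8 sequence.
 * Returns the number of bytes written, not counting the NUL.
 */
static size_t
latin1_to_utf8(const unsigned char *src, size_t len, char *dst)
{
    size_t i, o = 0;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < len; i++) {
        unsigned char c = src[i];

        if (c < 0x80) {
            /* ASCII range: one byte, unchanged */
            dst[o++] = (char)c;
        } else {
            /* 0x80..0xFF: lead byte 110xxxxx, then 10xxxxxx */
            dst[o++] = (char)(0xC0 | (c >> 6));
            dst[o++] = (char)(0x80 | (c & 0x3F));
        }
    }
    dst[o] = '\0';
    return o;
}
```

With this approach the string type itself only ever sees UTF-8, and the
choice of source character set stays in the dissector, where it can be
made at run time.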


