Wireshark-bugs: [Wireshark-bugs] [Bug 10445] New: UTF-8 characters end up escaped in psml output
Bug ID | 10445
Summary | UTF-8 characters end up escaped in psml output
Product | Wireshark
Version | 1.12.0
Hardware | All
OS | All
Status | UNCONFIRMED
Severity | Normal
Priority | Low
Component | TShark
Assignee | bugzilla-admin@wireshark.org
Reporter | joe@qacafe.com
Build Information:
TShark 1.12.0 (Git Rev Unknown from unknown)
Copyright 1998-2014 Gerald Combs <gerald@wireshark.org> and contributors.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Compiled (64-bit) with GLib 2.26.1, with libpcap, with libz 1.2.8, without POSIX
capabilities, without libnl, without SMI, with c-ares 1.7.0, with Lua 5.1,
without Python, with GnuTLS 2.8.5, with Gcrypt 1.4.5, with MIT Kerberos, without
GeoIP.
Running on Linux 2.6.32-358.el6.x86_64, with locale en_US.UTF-8, with libpcap
version 1.4.0, with libz 1.2.8.
Intel(R) Core(TM) i5-4430 CPU @ 3.00GHz
Built using gcc 4.4.7 20120313 (Red Hat 4.4.7-4).
--
Some protocol dissectors now use UTF-8 characters in their output. Notably, TCP,
UDP, and SCTP use the UTF-8 sequence \xe2\x86\x92 (a right arrow) instead of the
ASCII-friendly " > ".
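For anyone who wants to confirm which character those three bytes encode, a quick check in Python (illustration only, not part of TShark):

```python
import unicodedata

# The three bytes quoted above, as they show up in the Info column.
arrow_bytes = b"\xe2\x86\x92"
arrow = arrow_bytes.decode("utf-8")

# Print the code point and its official Unicode name.
print(f"U+{ord(arrow):04X}", unicodedata.name(arrow))  # U+2192 RIGHTWARDS ARROW
```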
Unfortunately, the psml and pdml output code calls print_escaped_xml(), which ends
up escaping UTF-8 characters. UTF-8 is the default encoding for XML, so these
characters don't really need to be escaped.
The escaped UTF-8 characters end up in the psml, where they are not very useful.
See the last <section> below.
<packet>
<section>31843</section>
<section>568.363627</section>
<section>193.37.150.253</section>
<section>1.2.3.4</section>
<section>TCP</section>
<section>60</section>
<section>80\xe2\x86\x924267 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0</section>
</packet>
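The difference between the two behaviours can be sketched in Python. This is a rough illustration, not Wireshark's code: escape_bytewise() and escape_utf8_aware() are hypothetical helpers, and the byte-wise version only assumes that the escaper works one byte at a time and writes anything outside printable ASCII as \xNN, which is what the output above suggests.

```python
# Hypothetical helpers for illustration; these are not Wireshark functions.
XML_ENTITIES = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&apos;"}


def escape_bytewise(text: str) -> str:
    """Assumed current behaviour: every byte outside printable ASCII becomes \\xNN."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in XML_ENTITIES:
            out.append(XML_ENTITIES[ch])
        elif 0x20 <= byte <= 0x7E:
            out.append(ch)
        else:
            out.append(f"\\x{byte:02x}")
    return "".join(out)


def escape_utf8_aware(text: str) -> str:
    """Suggested behaviour: escape XML specials and control characters only."""
    out = []
    for ch in text:
        if ch in XML_ENTITIES:
            out.append(XML_ENTITIES[ch])
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            out.append(f"\\x{ord(ch):02x}")
        else:
            # Non-ASCII characters pass through and are written as UTF-8.
            out.append(ch)
    return "".join(out)


info = "80\u21924267 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0"
print(escape_bytewise(info))    # 80\xe2\x86\x924267 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
print(escape_utf8_aware(info))  # 80→4267 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
```

The second helper still escapes the XML special characters and control characters, so the output stays well-formed; it just stops treating the bytes of a valid multi-byte character as unprintable.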