Wireshark-dev: Re: [Wireshark-dev] Terminating NULL character in RTCP Byereason string
On Aug 5, 2008, at 10:20 AM, Neil Piercy wrote:
The real problem in the spec is here - the leap from "octets of text"
to "string".
A sequence of octets of text *is* a (text) string. A string is not
necessarily null-terminated (the C programming language, and its
derivatives, notwithstanding).
It sounds as if whoever wrote RFC 3550 needs to learn the
difference between the words "padded" and "terminated" - they
probably meant to say that the string is null-*padded* to a
4-byte boundary.
Maybe they did know the difference, and maybe they didn't, but what
they actually said was:
If the string fills the packet to the next 32-bit boundary, the
string is not null terminated.
i.e. they have defined a case in which a "string" is not null
terminated (i.e. is a sequence of non-null characters only), so
Wireshark should not object to the string not being null terminated
in this case.
They also said
The string has the same encoding as that described for SDES.
and what they say for SDES is
Each chunk consists of an SSRC/CSRC identifier followed by a list of
zero or more items, which carry information about the SSRC/CSRC.
Each chunk starts on a 32-bit boundary. Each item consists of an
8-bit type field, an 8-bit octet count describing the length of the
text (thus, not including this two-octet header), and the text
itself. Note that the text can be no longer than 255 octets, but
this is consistent with the need to limit RTCP bandwidth consumption.
The text is encoded according to the UTF-8 encoding specified in RFC
2279 [5]. US-ASCII is a subset of this encoding and requires no
additional encoding. The presence of multi-octet encodings is
indicated by setting the most significant bit of a character to a
value of one.
Items are contiguous, i.e., items are not individually padded to a
32-bit boundary. Text is not null terminated because some
multi-octet encodings include null octets. The list of items in each
chunk MUST be terminated by one or more null octets, the first of
which is interpreted as an item type of zero to denote the end of the
list. No length octet follows the null item type octet, but
additional null octets MUST be included if needed to pad until the
next 32-bit boundary. Note that this padding is separate from that
indicated by the P bit in the RTCP header. A chunk with zero items
(four null octets) is valid but useless.
which describes null-padded strings, not null-terminated strings (in
fact, they explicitly say "Text is not null terminated because some
multi-octet encodings include null octets" - although they earlier say
the encoding is UTF-8, which *doesn't* include null octets in
multi-octet encodings).
I.e., RFC 3550 needs a little attention from an editor.
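To make the SDES layout quoted above concrete, here is a rough sketch in C of walking one chunk: skip the 4-octet SSRC/CSRC, step over each type/length/text item using the length octet only, stop at the null item type, and then round up to the next 32-bit boundary. The function name `sdes_chunk_len` and its signature are made up for illustration; this is not Wireshark's dissector code, and it does not check that the padding octets are actually null.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from Wireshark): walk one SDES chunk that
 * starts at chunk[0] on a 32-bit boundary.  Returns the chunk length in
 * octets, including the terminating null octets and padding, or 0 if the
 * chunk is truncated.  *n_items receives the number of items seen.
 */
size_t sdes_chunk_len(const uint8_t *chunk, size_t avail, size_t *n_items)
{
    size_t off = 4;                 /* skip the SSRC/CSRC identifier */
    size_t count = 0;

    if (avail < 4)
        return 0;

    for (;;) {
        if (off >= avail)
            return 0;               /* no terminating null item type */
        if (chunk[off] == 0)
            break;                  /* item type 0 ends the list */
        if (off + 2 > avail)
            return 0;               /* no room for the length octet */
        size_t len = chunk[off + 1];
        if (off + 2 + len > avail)
            return 0;               /* text runs past the buffer */
        /* The text is chunk[off + 2 .. off + 2 + len - 1]; it is
         * delimited by len only, never by a null octet. */
        off += 2 + len;
        count++;
    }

    /* Consume the null item type octet, then pad to a multiple of 4. */
    size_t end = (off + 1 + 3) & ~(size_t)3;
    if (end > avail)
        return 0;

    *n_items = count;
    return end;
}
```

Note that the text length never depends on finding a null octet; the nulls only mark the end of the item list and fill out the chunk.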
In any case, what Wireshark should do is treat the BYE Reason string
as null-padded, not as "null-terminated except when it isn't".
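As a sketch of what "treat it as null-padded" could look like in C: take the text length from the length octet alone, round the field up to a 32-bit boundary, and treat whatever lies between the end of the text and that boundary as padding. The names `bye_reason` and `parse_bye_reason` are hypothetical and this is not the actual Wireshark RTCP dissector; it also assumes `buf` points at the length octet and that this octet sits on a 32-bit boundary, as it does after the header and SSRC list.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type, for illustration only. */
typedef struct {
    const uint8_t *text;    /* first octet of the reason text */
    size_t text_len;        /* taken from the length octet */
    size_t field_len;       /* length octet + text + padding */
    int padding_is_null;    /* 1 if every padding octet is zero */
} bye_reason;

/*
 * buf points at the reason length octet.  Returns 0 on success, or -1 if
 * the field (including padding) does not fit in the avail octets.
 */
int parse_bye_reason(const uint8_t *buf, size_t avail, bye_reason *out)
{
    if (avail < 1)
        return -1;

    size_t text_len = buf[0];
    size_t unpadded = 1 + text_len;
    size_t padded = (unpadded + 3) & ~(size_t)3;   /* next 32-bit boundary */

    if (padded > avail)
        return -1;

    out->text = buf + 1;
    out->text_len = text_len;
    out->field_len = padded;
    out->padding_is_null = 1;

    /* Zero to three padding octets; zero of them is perfectly legal. */
    for (size_t i = unpadded; i < padded; i++) {
        if (buf[i] != 0)
            out->padding_is_null = 0;
    }
    return 0;
}
```

With this approach a reason such as the three octets "bye" after a length octet of 3 fills the 32-bit word exactly, has no padding, and raises no complaint, while a two-octet reason gets one null pad octet.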