Wireshark-dev: Re: [Wireshark-dev] tvb_reported_length() vs tvb_length()
From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Thu, 24 Sep 2009 14:37:48 -0700

On Sep 24, 2009, at 1:08 PM, Bill Meier wrote:

Jeff Morriss wrote:

After having made many a comment (on new dissectors) about using the
wrong function, I start to wonder:

Should we rename tvb_length() to, say, tvb_captured_length() and
tvb_reported_length() to tvb_length()?

Looking at it from the point of the TVB API it doesn't make much sense, but since [it appears that] many dissector writers tend to not think of
snapshot lengths when dealing with TVBs, it might be the best way to
prevent this problem going forward.

Or maybe just rename tvb_length() to tvb_captured_length() ?

I've been thinking the same thing for a while. I think the name dates back to the very early days of tvbuffs, when they *had* only one length; the reported length was added so that we could distinguish between "the dissector ran past the end of the actual packet data" (which means either "the packet is malformed", "the dissector is buggy", or "reassembly needs to be done" - it'd be nice to be able to distinguish between the first and third of those, but that'd take some more work; distinguishing either from "the dissector is buggy" is a bit trickier), and the routine to get it was given a new name but the old name wasn't changed (probably because it was still thought of as "the" length, as it's the amount of data available from the tvbuff).

What are the cases where the use of tvb_length... in a dissctors is valid ?

Off the top of my head, I'd say it's valid if:

Looking at README.developer in detrail I see that:

1. tvb_length is shown as being used before doing a heuristics check in
   new-style (and presumably heuristics) dissectors.
   In fact: I now see that the return is shown as
   tvb_length not tvb_reported_length.
   (Given this, I made an incorrect commit in a recent commit).

...you're doing a check of some sort before you start dissecting, such as a heuristic dissector test, and you need to know how much data is actually available so you don't throw an exception and abort the dissection or...

2. When calling tvb_new-subset.

...you're constructing a new subset and need to set both its lengths correctly or...

Others ?

...you're doing something with trailing data (Ethernet trailers, AAL5 trailers, trailing FCSes, etc.) and you need to avoid throwing an exception before start dissecting the payload.

There might be a few others, but they're probably rare.

Most calls to tvb_new_subset() can probably be replaced in the main branch with calls to tvb_new_subset_remaining() - this includes not only calls to tvb_new_subset(tvb, offset, -1, -1) but also calls that do the equivalent, such as tvb_new_subset(tvb, offset, tvb_length_remaining(tvb, offset), tvb_reported_length_remaining(tvb, offset)) or, worse, tvb_new_subset(tvb, offset, tvb_length_remaining(tvb, offset), tvb_reported_length_remaining(tvb, offset)) (or variants of the same that involve getting the lengths in question into variables and passing those variables, or computing the remaining lengths "by hand" by subtracting the offset from the total lengths).

At some point, we should probably have a new-subset routine that takes a tvbuff, an offset, and a *single* length, which is the desired *reported* length, and that calculates the appropriate captured length.