Wireshark-dev: Re: [Wireshark-dev] Static analysis and FT_STRING encodings
From: Evan Huus <eapache@xxxxxxxxx>
Date: Thu, 12 Apr 2012 15:09:11 -0400
On Thu, Apr 12, 2012 at 2:33 PM, Guy Harris <guy@xxxxxxxxxxxx> wrote:

On Apr 12, 2012, at 10:47 AM, Evan Huus wrote:

> 2. Most dissectors add FT_STRINGs with an encoding value of 'ENC_ANSII | ENC_NA'. Based on the comments in epan/proto.h this doesn't make sense (they should be 'ENC_ASCII' only?),

Which comments?
 
The comment for ENC_NA reads as:
 * For protocols (FT_PROTOCOL), aggregate items with subtrees (FT_NONE),
 * opaque byte-array fields (FT_BYTES), and other fields where there
 * is no choice of encoding (either because it's "just a bucket
 * of bytes" or because the encoding is completely fixed), we
 * have ENC_NA (for "Not Applicable").

My interpretation of this is that it's not *wrong* to say 'ENC_ASCII | ENC_NA', but simply redundant. I've been reading ENC_NA as "endianness is irrelevant but we need a placeholder here". Since ENC_ASCII implies everything we need to know about byte order, adding an ENC_NA isn't really useful.

We currently have no ENC_UTF_16, but, when we do, it would have to be ORed either with ENC_BIG_ENDIAN or ENC_LITTLE_ENDIAN, so, at least for some character encodings, a byte-order specification would make sense.
 
Agreed.
 
ASCII has no byte-order issues, and the convention is that ENC_NA is used to explicitly indicate that the byte order is not applicable.
 
But, as above, we already get that from the fact that it's ASCII in the first place. If the agreed style is to be explicit in these cases, that's fine, but then that should be made clearer in the comments.

(As an aside, I'm assuming based on this that the few places which do 'ENC_ASCII | ENC_BIG_ENDIAN' or 'ENC_ASCII | ENC_LITTLE_ENDIAN' are actually wrong?)

The comment for ENC_NA should probably be updated to reflect that.
 
Yes, a clearer comment would help. Appending the following to what's already there would be good:

"For encodings (such as ENC_ASCII) where the encoding already specifies the byte order (or lack thereof), ENC_NA should be explicitly ORed in for completeness."

> and they cause CppCheck to complain because both ENC_s are #defined to 0 (which makes or-ing them a no-op).

On occasion, C/C++/etc. programmers might have values that consist of multiple bitfields and that valid values of those types might have more than one of said bitfields with values of 0, and therefore that there are legitimate reasons to OR together various constants that happen to be 0.  ENC_ values happen to be one of those occasions.
 
True.

Given that, for cppcheck, "The goal is no false positives."

       http://sourceforge.net/apps/mediawiki/cppcheck/index.php?title=Main_Page

they either need to stop warning about that or provide a way to control which cases of ORing 0's together to warn about.

They do provide a way to control that warning. I just wasn't sure if it was right to filter those instances or not. Now that I know, I can adjust the CppCheck parameters accordingly.

Thanks,
Evan