Wireshark-dev: Re: [Wireshark-dev] Overview of MPLS PW bugs
From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Sat, 7 Jan 2017 17:31:16 -0800
On Jan 7, 2017, at 9:39 AM, Jaap Keuter <jaap.keuter@xxxxxxxxx> wrote:

> There has been a steady stream of MPLS PW related comments and bugs over time,
> and things haven't improved enough, apparently. This text tries to give some
> insight in the issues so that possible solutions cover all cases involved.

I'll start here with a broader discussion of how a protocol specifies the protocol running above it, and how the dissector for the first protocol selects the dissector for the next protocol.

Some protocols have a "next protocol" field or fields (Ethernet, IEEE 802.2, SNAP, IPv4, IPv6, anything with an IANA media type string, ...).  For those, it's easy - use a dissector table, and it will handle 99.99999999999999999% of the cases correctly.  "Decode As" would be necessary only in cases where two protocols are using the same value, either because the old protocol's assignment was revoked and a new protocol given the assignment (which doesn't sound like good practice, unless you have a *really* small set of possible values) or because somebody's not playing by the rules.  Heuristics should only be necessary in cases where the field is optional, such as IANA media type strings.
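
To make the "next protocol field plus dissector table" case concrete, here is a minimal sketch in Python rather than Wireshark's actual C API.  The names ETHERTYPE_TABLE, dissect_ethernet and the decode_as argument are made up for illustration; only the Ethertype values 0x0800 (IPv4) and 0x86DD (IPv6) are real assignments.

```python
from typing import Callable, Dict, Optional

# A dissector here is just a function that names the protocol it handled.
Dissector = Callable[[bytes], str]


def dissect_ipv4(payload: bytes) -> str:
    return "IPv4"


def dissect_ipv6(payload: bytes) -> str:
    return "IPv6"


# Hypothetical dissector table keyed by the Ethertype field.
ETHERTYPE_TABLE: Dict[int, Dissector] = {
    0x0800: dissect_ipv4,
    0x86DD: dissect_ipv6,
}


def dissect_ethernet(frame: bytes,
                     decode_as: Optional[Dict[int, Dissector]] = None) -> str:
    """Pick the next dissector purely from the type field.

    decode_as models a user override for the rare case where two
    protocols claim the same field value.
    """
    if len(frame) < 14:
        return "Malformed"
    ethertype = int.from_bytes(frame[12:14], "big")
    payload = frame[14:]
    overrides = decode_as or {}
    dissector = overrides.get(ethertype) or ETHERTYPE_TABLE.get(ethertype)
    return dissector(payload) if dissector else "Data"
```

No heuristics are involved: the field value alone selects the dissector, and the override only matters when a value is contested.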

Some protocols have a field or fields that indicate a source or destination port, or a circuit number, which might be a usable *hint* for identifying the next protocol, but which is not sufficient to indicate it (TCP and UDP are the canonical examples of this; ATM's VPI/VCI are another example).  For those, you use a dissector table with dissectors that do checks and reject packets, heuristics, mechanisms for other dissectors to make port-to-protocol assignments if that can be done (e.g., SDP and RTCP setting up RTP sessions), and, when all else fails - which could be fairly common - Decode As.
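
A rough sketch of that fallback chain, again in Python and not Wireshark's real API.  Everything here is hypothetical: the function pick_dissector, the toy checks in dissect_dns and heur_rtp, and the lookup order (user override first, then dynamically assigned ports, then the static port table, then heuristics) are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional

# A dissector returns a protocol name, or None to reject the payload.
Dissector = Callable[[bytes], Optional[str]]


def dissect_dns(payload: bytes) -> Optional[str]:
    # Toy check: a DNS message has at least a 12-byte header.
    return "DNS" if len(payload) >= 12 else None


def heur_rtp(payload: bytes) -> Optional[str]:
    # Toy check: the RTP version (top two bits of the first byte) is 2.
    if len(payload) >= 12 and payload[0] >> 6 == 2:
        return "RTP"
    return None


def pick_dissector(port: int,
                   payload: bytes,
                   decode_as: Dict[int, Dissector],
                   dynamic: Dict[int, Dissector],
                   port_table: Dict[int, Dissector],
                   heuristics: List[Dissector]) -> str:
    """Treat the port as a hint only; any dissector may reject the payload."""
    # Explicit user choice, then assignments made by other dissectors
    # (such as a signalling protocol), then the "well known" port table.
    for table in (decode_as, dynamic, port_table):
        dissector = table.get(port)
        if dissector is not None:
            result = dissector(payload)
            if result is not None:
                return result
    # No table entry accepted the payload: ask each heuristic in turn.
    for heuristic in heuristics:
        result = heuristic(payload)
        if result is not None:
            return result
    return "Data"
```

The fewer entries there are in port_table, the more often the heuristics list or the decode_as map ends up doing the work.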

For the latter category of protocols, the fewer "well known" field values there are, the more you depend on heuristics to avoid Decode As.  MPLS is a protocol with *very* few "well known" label values.

MPLS is a very good example of the last category of protocols; RFC 3032 reserves only 16 label values (0 through 15).  That's the problem here; with TCP, for example, at least a lot of the well-known ports help.

The protocols we support atop MPLS are:

	I-TDM (Internal TDM)
	Y.1711 (has a reserved label)
	ATM pseudo-wires of various sorts (RFC 4717)
	CESoPSN pseudo-wire (RFC 5086)
	Ethernet pseudo-wire (RFC 4448)
	Frame Relay pseudo-wire (RFC 4619)
	PPP/HDLC pseudo-wires (RFC 4618)
	SAToP (RFC 4553)
	IPv4
	IPv6
	Pseudo-wire Associated Channel Header dissection (RFC 4385)

For most of these, we require Decode As.

The exceptions are:

	IPv4, IPv6, Associated Channel Header, Ethernet

for which, for frames with no explicit binding to a label, we use the first-nibble heuristic, possibly combined with other heuristics.
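
For illustration, a bare first-nibble check might look like the sketch below.  This is Python, not the actual MPLS dissector code, and the function name and return strings are invented.  The nibble values 0 (pseudo-wire control word) and 1 (Associated Channel Header) come from RFC 4385; 4 and 6 are the IP version fields.

```python
def classify_by_first_nibble(payload: bytes) -> str:
    """Guess what follows the bottom-of-stack label from the first nibble alone."""
    if not payload:
        return "unknown"
    nibble = payload[0] >> 4
    if nibble == 4:
        return "ipv4"
    if nibble == 6:
        return "ipv6"
    if nibble == 1:
        return "pw-ach"
    if nibble == 0:
        return "pw-control-word"
    # Anything else would have to go to other heuristics,
    # such as "is this Ethernet without a control word?".
    return "unknown"


# An Ethernet frame without a control word whose destination MAC
# address starts with 0x4c is classified as IPv4 by this check.
frame = bytes([0x4C, 0x11, 0x22, 0x33, 0x44, 0x55]) + bytes(8)
print(classify_by_first_nibble(frame))  # prints "ipv4"
```

The last two lines show why the nibble alone is not enough: a MAC address is under no obligation to avoid 4 or 6 in its first nibble.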

In addition, the Ethernet pseudo-wire dissector also uses heuristics to determine whether there's a control word or not.  It *looks* as if the ATM pseudo-wire dissector always assumes a control word, even though RFC 4717 says

   The features that the control word provides may not be needed for a
   given ATM PW.  For example, ECMP may not be present or active on a
   given MPLS network, strict frame sequencing may not be required, etc.
   If this is the case, and the control word is not REQUIRED by the
   encapsulation mode for other functions (such as length or the
   transport of ATM protocol specific information), the control word
   provides little value and is therefore OPTIONAL.  Early ATM PW
   implementations have been deployed that do not include a control word
   or the ability to process one if present.  To aid in backwards
   compatibility, future implementations MUST be able to send and
   receive frames without a control word present.

If the control word were *always* present, we wouldn't be having these problems, and people wouldn't be filing bugs.  Thus, the bugs demonstrate that, at least for Ethernet, the control word isn't always present.

Bug 11849 was due to an "is this Ethernet?" heuristic being too strong, by accepting only a small number of Ethertypes; it was fixed by weakening the heuristic not to look at the type/length field at all.

Bug 13039 is due to the "is this Ethernet?" heuristic being too strong, by not accepting frames with local MAC addresses.

Bug 13295 is due to the "is this Ethernet?" heuristic being too weak, by accepting frames with unknown Ethernet types.

Bug 13301 is due to the "is this IPv4?" and "is this IPv6?" heuristics being too weak, by accepting, respectively, every frame with 4 in the first nibble as IPv4 and every frame with 6 in the first nibble as IPv6.

There are a number of ways to solve this:

	1) Make the Ethernet dissector like the other pseudo-wire dissectors, and require "Decode As".

	   Presumably this was not done because Ethernet pseudo-wires are popular enough that this would require too much "Decode As".  (And, presumably, the other pseudo-wires are *not* popular enough for this to be an issue.)

	2) Fix the heuristics for Ethernet-without-control-word.

	   This would address bugs 13039 and 13295, by weakening the heuristic where it needs to be weaker and strengthening where it needs to be stronger (the latter also makes the former less likely to break things).

	   It doesn't address bug 13301, however.

	3) Fix the heuristics for Ethernet-without-control-word and even hand frames with a first nibble of 4 or 6 to an "is this Ethernet without a control word?" heuristic dissector - if that dissector says "no", dissect the packets as IPv4 or IPv6.

	   That would also fix 13301, but would run the risk of mis-dissecting some IPv4 or IPv6 frames as Ethernet-without-control-word.

	4) Fix the heuristics for Ethernet-without-control-word and strengthen the first-nibble checks for IPv4 and IPv6 to also check some other fields, such as the "protocol" and "next header" fields.

	   That would also fix 13301, but would run the risk of mis-dissecting some IPv4 or IPv6 frames as Ethernet-without-control-word, although that risk might be lower than with 3).
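
As a sketch of what the stronger checks in 4) could look like (Python, hypothetical names, not a proposed patch): besides the version nibble, look at the IPv4 header length and "protocol" field, or the IPv6 "next header" field.  The set of accepted protocol numbers below is an arbitrary illustrative subset of the IANA registry, and the header-length check is an extra assumption beyond the fields mentioned above.

```python
# Illustrative subset of IANA protocol numbers: ICMP, IGMP, TCP, UDP,
# IPv6-in-IP, GRE, ESP, AH, ICMPv6, OSPF, SCTP.
KNOWN_IP_PROTOCOLS = frozenset({1, 2, 6, 17, 41, 47, 50, 51, 58, 89, 132})

# IPv6 extension headers: hop-by-hop, routing, fragment, destination options.
IPV6_EXTENSION_HEADERS = frozenset({0, 43, 44, 60})


def looks_like_ipv4(payload: bytes) -> bool:
    if len(payload) < 20 or payload[0] >> 4 != 4:
        return False
    ihl = payload[0] & 0x0F  # header length in 32-bit words
    if ihl < 5:
        return False
    total_length = int.from_bytes(payload[2:4], "big")
    if total_length < ihl * 4:
        return False
    return payload[9] in KNOWN_IP_PROTOCOLS  # "protocol" field


def looks_like_ipv6(payload: bytes) -> bool:
    if len(payload) < 40 or payload[0] >> 4 != 6:
        return False
    next_header = payload[6]  # "next header" field
    return (next_header in KNOWN_IP_PROTOCOLS
            or next_header in IPV6_EXTENSION_HEADERS)
```

A frame that fails both checks could then be offered to the Ethernet-without-control-word heuristic.  This is still a heuristic: a MAC address whose bytes happen to line up with plausible header fields would still fool it, and a protocol number missing from the set would be rejected.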

We might also want to have a preference to deal with the "first nibble of the MAC address is 4 or 6" issue.