Wireshark-dev: Re: [Wireshark-dev] [Help_Wireshark] difference between fragmentation reassembly
From: John Thacker <johnthacker@xxxxxxxxx>
Date: Wed, 5 Jul 2017 07:29:03 -0400
On Wed, Jul 5, 2017 at 4:41 AM, Pascal Quantin <pascal.quantin@xxxxxxxxx> wrote:


2017-07-05 8:32 GMT+02:00 hhw hhw <hhw.hhw7@xxxxxxxxx>:
my packets are like:
  sequence id   sequence number   message type
       16              0          0  Begin
       16              1          1  Continue
       16              2          2  End (more_frag=FALSE)

PDUs with sequence numbers 0-2 should be reassembled here.
PDUs with sequence numbers 3-10: not fragmented
----------------------------------------------------------------------
        5             11          0  Begin
        5             12          1  Continue
        5             13          2  End (more_frag=FALSE)

PDUs with sequence numbers 11-13 should be reassembled here.
PDUs with sequence numbers 14-19: not fragmented
-----------------------------------------------------------------------
       16             20          0  Begin
       16             21          2  End (more_frag=FALSE)

PDUs with sequence numbers 20-21 should be reassembled here.


Now, how can I reassemble the fragmented PDUs? Which reassembly function should I use:
fragment_add_seq_check, fragment_add_seq, or fragment_add_check?

It's a shame your protocol does not have an explicit fragment number. I wonder how this is supposed to work if a UDP datagram is lost... If this can be changed, it would be better to do so, and it would solve your reassembly API usage headache at the same time.
 
There are a number of protocols that work in a similar fashion without explicit fragment numbers. PPP MP (RFC 1990 / 2686) is one such. Instead of getting "sequence_number, fragment_number, last" as in other protocols, PPP MP provides a single sequence number that is effectively "seqnum + fragnum", and it provides flags for both the first and last fragment of a reassembly. For things that are not fragmented, both the First and Last (Begin and End) flags are set.
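
To make the Begin/End flag scheme concrete, here is a minimal Python sketch of reassembly over an in-order, loss-free stream. It is only a toy model of the idea, not how Wireshark implements it; the names Fragment and reassemble_in_order are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    seq: int      # single running sequence number ("seqnum + fragnum")
    first: bool   # Begin flag
    last: bool    # End flag
    data: bytes

def reassemble_in_order(fragments):
    """Return reassembled payloads, assuming no loss and no reordering."""
    out = []
    pending = None                 # fragments of the packet being built
    for f in fragments:
        if f.first:
            pending = []           # a Begin flag starts a new packet
        if pending is None:
            continue               # Continue/End with no Begin seen: skip
        pending.append(f.data)
        if f.last:
            out.append(b"".join(pending))
            pending = None         # an End flag completes the packet
    return out

stream = [
    Fragment(0, True, False, b"ab"),   # Begin
    Fragment(1, False, False, b"cd"),  # Continue
    Fragment(2, False, True, b"ef"),   # End
    Fragment(3, True, True, b"gh"),    # unfragmented: both flags set
]
packets = reassemble_in_order(stream)
```

An unfragmented PDU needs no special case here: with both flags set it opens and closes a packet in one step.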

That seems like a similar approach to this protocol. See Appendix A of RFC 4623 (PWE3 Fragmentation and Reassembly) for a list of protocols that use this style, including PPP MP (RFC 1990), PWE3 MPLS (RFC 4385), L2TPv2 (RFC 2661), L2TPv3 (RFC 3931), ATM, and Frame Relay. It's not ideal, because when there's datagram loss or out-of-order delivery it's unclear which reassembled packet a fragment belongs to. E.g., if you receive sequence numbers 12 and 13 above before receiving 2-11, then you don't know if 12 and 13 are the twelfth and thirteenth fragments in a large packet beginning with 0, or whether the packet beginning with 0 ends sooner. You have to do some juggling.
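
The ambiguity can be illustrated with a small Python sketch. It is a toy model under stated assumptions (no sequence number wraparound, and a hypothetical helper named complete_runs), not Wireshark's algorithm: a packet only counts as complete once there is an unbroken run of sequence numbers from a Begin flag to an End flag.

```python
def complete_runs(received):
    """received maps seq -> (first, last). Return (start, end) pairs for
    every unbroken run that goes from a Begin flag to an End flag."""
    runs = []
    for start in sorted(received):
        first, _ = received[start]
        if not first:
            continue                  # a run can only start at a Begin
        seq = start
        while seq in received:
            is_first, is_last = received[seq]
            if seq != start and is_first:
                break                 # another Begin before an End: broken run
            if is_last:
                runs.append((start, seq))
                break
            seq += 1                  # no wraparound handling in this sketch
    return runs

# Fragments 12 (Continue) and 13 (End) arrive before anything else.
received = {12: (False, False), 13: (False, True)}
early = complete_runs(received)       # nothing can be resolved yet

# 0-2 arrive: that packet completes, but 12 and 13 are still unattached.
received.update({0: (True, False), 1: (False, False), 2: (False, True)})
middle = complete_runs(received)

# Only when 11 (Begin) arrives do 12 and 13 find their packet.
received[11] = (True, False)
final = complete_runs(received)
```

Until the Begin for a run shows up, the stray fragments have to be held somewhere, which is the juggling mentioned above.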

Anyway, I implemented such an approach for PPP MP that works for the captures I have. Take a look at fragment_add_seq_single and fragment_add_seq_single_aging. The latter adds aging, which assumes that fragments shouldn't be reassembled together with other fragments that arrived much later. This is necessary for cases where some packet loss is combined with the sequence number wrapping around. E.g., suppose that above your packet 21 was lost, so the reassembly starting with 20 was never completed; if the sequence number then wrapped around and you got another packet with sequence number 21 much later in the capture, incorrect reassembly could occur.
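
The aging idea can be sketched in a few lines of Python. This is a toy model only: the helper name age_out is hypothetical, and measuring age in frame numbers is an assumption made for the example, not a statement about how fragment_add_seq_single_aging measures it.

```python
def age_out(pending, current_frame, max_age):
    """Return only the pending fragments that are at most max_age frames old.

    pending maps sequence number -> frame number in which it was seen.
    Age in frames is an assumption of this sketch.
    """
    return {seq: frame for seq, frame in pending.items()
            if current_frame - frame <= max_age}

# Sequence number 20 (a Begin) was seen in frame 100; its End, 21, was lost.
pending = {20: 100}

# Shortly afterwards the fragment is still eligible for reassembly.
recent = age_out(pending, current_frame=105, max_age=1000)

# Much later, after the sequence number has wrapped, a new 21 shows up in
# frame 5000. The stale 20 has been aged out, so the two are not joined.
stale = age_out(pending, current_frame=5000, max_age=1000)
```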

It's not perfect, because it doesn't properly handle a packet that is fragmented across the boundary where the sequence number wraps around. That wasn't a big deal with PPP MP (24- or 12-bit sequence numbers), as it only affected a small percentage of packets, but I do recall someone on the list wanting to use it with a protocol with much shorter sequence numbers, where wraparound was frequent. Since it's not perfect and I didn't have examples of the other protocols, I didn't change the others in that list to use the new API function.
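
A rough Python sketch of why the width of the sequence number matters. The helper names straddles_wrap and seq_range are hypothetical, and the 4-bit width and three-fragment packet are purely illustrative, not taken from any particular protocol.

```python
def straddles_wrap(start_seq, n_frags, bits):
    """True if a packet of n_frags fragments starting at start_seq
    crosses the point where a bits-wide sequence number wraps to 0."""
    return start_seq + n_frags - 1 >= (1 << bits)

def seq_range(start_seq, n_frags, bits):
    """Sequence numbers used by the packet, with modular wraparound."""
    mod = 1 << bits
    return [(start_seq + i) % mod for i in range(n_frags)]

# A 3-fragment packet starting at 4094 with 12-bit sequence numbers:
crossing = seq_range(4094, 3, 12)

# How many start positions make a 3-fragment packet straddle the wrap?
bad_12bit = sum(straddles_wrap(s, 3, 12) for s in range(1 << 12))
bad_4bit = sum(straddles_wrap(s, 3, 4) for s in range(1 << 4))
```

In both cases only two start positions are affected, but that is 2 out of 4096 with 12 bits and 2 out of 16 with 4 bits, so a short sequence number hits the unhandled case far more often.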

John Thacker