Ethereal-users: [Ethereal-users] Out of order reception woes

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: "Stock, Stuart [TKY]" <Stuart.Stock@xxxxxxxx>
Date: Tue, 25 May 2004 11:25:06 +0900
All,

This is a bit complicated, so please bear with me. The short version is: we
have 3 instances of the same application all consuming the same low-rate
(max 1.5 Mbps) multicast data. Randomly one of the instances will report
out-of-order reception of the data while the other 2 see the data correctly
ordered. All of the instances are on the same switch and consume data from
the same source. Ethereal dumps of these events consistently show a negative
frame.time_delta for the instance reporting the problem and normally ordered
data for the 2 others.

I suspect the issue lies somewhere in the kernel or the Ethernet driver,
since the timestamps show the data was received off the wire in the correct
order but was mixed up before being handed to libpcap and my application.
The OS is Red Hat Linux AS 2.1 on x86.

My questions are:
1. Has anybody experienced similar out-of-order problems and might be able
to shed some light on this?
2. Where do the libpcap timestamps really come from? Looking at my libpcap
sources I see pcap_pkthdr.ts getting its value from ioctl(handle->fd,
SIOCGSTAMP, &pcap_header.ts), but I'm not sure whether it's the file
descriptor that supplies the timestamp or the underlying driver.

Thanks in advance,
Stuart

Details
-------
We have an application that consumes multicast data. The data payload of
each multicast message contains an application sequence number that
monotonically increases. We have three instances of the application running
on three different physical servers and all of the servers are connected to
the same switch and sit in the same VLAN. 

The problem is that occasionally our application reports out-of-sequence
reception of the multicast data (based on inspection of the payload sequence
number). What's interesting is that only one of the instances will report
the out-of-sequence data; the other two see the data as properly sequenced.
Remember that these instances see the same source of the data and sit on the
same switch, so if the reordering were occurring on the network or at the
data source, all three instances should be equally affected.
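To make the check concrete, here is a minimal Python sketch of the kind of
test our application applies. It is not our production code: the payload
layout (a 4-byte big-endian sequence number at the start of each message)
and the names read_seq and find_out_of_sequence are assumptions made purely
for illustration.

```python
import struct

def read_seq(payload):
    """Extract the sequence number from a payload.

    Assumed (hypothetical) layout: the first 4 bytes hold an unsigned
    big-endian sequence number.
    """
    return struct.unpack_from("!I", payload, 0)[0]

def find_out_of_sequence(payloads):
    """Report payloads whose sequence number does not exceed the highest seen.

    Returns a list of (index, highest_seen, seq) tuples, where index is the
    0-based position of the late payload in arrival order.
    """
    late = []
    highest = None
    for index, payload in enumerate(payloads):
        seq = read_seq(payload)
        if highest is not None and seq <= highest:
            late.append((index, highest, seq))
        else:
            highest = seq
    return late

# Made-up example data: messages 3 and 4 arrive swapped.
example = [struct.pack("!I", n) + b"data" for n in (1, 2, 4, 3, 5)]
late = find_out_of_sequence(example)
```

With the made-up data above, only the message carrying sequence number 3 is
reported, because it arrives after 4 has already been seen.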

After running a tethereal ring-buffer capture on all three servers and
capturing several of the out-of-sequence events, we consistently see a
negative frame.time_delta on the instance that reports the out-of-sequence
data:

tethereal -td -r excerpt.dump
1   0.000000 10.168.31.66 -> 224.10.1.1   UDP Source port: 18891  Destination port: 18891
2   0.002203 10.168.31.66 -> 224.10.1.1   UDP Source port: 18891  Destination port: 18891
3   0.009436 10.168.31.66 -> 224.10.1.1   UDP Source port: 18891  Destination port: 18891
4  -0.000747 10.168.31.66 -> 224.10.1.1   UDP Source port: 18891  Destination port: 18891
5   0.009892 10.168.31.66 -> 224.10.1.1   UDP Source port: 18891  Destination port: 18891
6   0.000833 10.168.31.66 -> 224.10.1.1   UDP Source port: 18891  Destination port: 18891
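
As a worked check of the excerpt, the following Python sketch accumulates
the six deltas into timestamps relative to frame 1 and then re-sorts the
frames by timestamp. The delta values are copied from the output above; the
function names are my own and exist only for this illustration.

```python
from decimal import Decimal

# Deltas (seconds since the previous frame) copied from the excerpt.
deltas = ["0.000000", "0.002203", "0.009436",
          "-0.000747", "0.009892", "0.000833"]

def arrival_times(deltas):
    """Turn per-frame deltas into cumulative timestamps relative to frame 1."""
    total = Decimal("0")
    times = []
    for d in deltas:
        total += Decimal(d)
        times.append(total)
    return times

def negative_delta_frames(deltas):
    """Return the 1-based numbers of frames stamped earlier than their predecessor."""
    return [i + 1 for i, d in enumerate(deltas) if Decimal(d) < 0]

times = arrival_times(deltas)
flagged = negative_delta_frames(deltas)

# Frame numbers in timestamp order rather than delivery order.
by_timestamp = sorted(range(1, len(times) + 1), key=lambda n: times[n - 1])
```

Here flagged comes out as [4], and by_timestamp is [1, 2, 4, 3, 5, 6]:
frame 4 carries a timestamp of 0.010892 s, earlier than frame 3's
0.011639 s, even though it was delivered to the capture after frame 3.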

Examination of the dumps on the other two instances confirms they see the
data correctly ordered. The payload of the out-of-sequence dump contains the
correct (but mis-ordered) data and verifies the application was correctly
reporting the out-of-order situation.

Over time the out-of-order events have affected all three instances, with no
bias towards any one of them. I don't think the level of traffic is an issue,
as the multicast data is fed from a T1 (1.5 Mbps) and the servers are dual
3.06 GHz Xeon Linux boxes running only this application.

