Wireshark-users: [Wireshark-users] TCP question: retransmission or prodding the peer?
Date: Thu, 20 Feb 2014 11:04:52 +0000 (GMT)
Hi all

I am trying to track down a problem with an embedded device (a card reader attached to a printer/copier) which is part of a "follow me printing" solution: a user starts a print job, walks to the next available print machine, inserts a card/badge, is shown the list of his/her queued jobs, selects one or more and prints them, and the card gets billed by the page, etc.

This is usually done using a sequence of TCP sessions between card reader and card server. Eventually, the card server will notify the print server to push the selected print job to the printer, and will maintain a flow of packets to the card reader, sending a billing notification for every single page printed.

Every so often, there seems to be a stall in communication between the card reader and the server, but only during the very first TCP session.

After the full three-way handshake and two or three more packets, there is a stall of ca. 2.5 seconds. This delay is noticeable to the user, and this is what we're trying to track down.

After that delay, the card server sends a new packet with:
  - 1 byte payload (and 1 byte less of padding in the IP header)
  - PSH set
  - the **same** SEQ/ACK numbers as the packet before the delay (see frames 9 and 10).

A similar effect can be observed in frames 5 and 6, but there the "delay" is only 7.5 ms. This time, it is the card reader that resends a packet to the server, with:
  - 1 byte payload (and 1 byte less of padding in the IP header)
  - PSH set
  - the **same** SEQ/ACK numbers as the packet before the delay (see frames 5 and 6).
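
To make "same SEQ, but with 1 byte of payload" concrete, here is a minimal Python sketch of the comparison a sequence analysis boils down to: a segment only counts as a retransmission if it carries payload that starts below the highest byte its sender has already sent. The `Segment` record, the `classify` helper and the example values are hypothetical and not taken from the attached capture. This is also not Wireshark's actual heuristic, which additionally handles SYN/FIN, keep-alives, sequence wraparound and timing.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    frame: int        # frame number in the capture
    src: str          # sender label, e.g. "server" or "reader"
    seq: int          # relative TCP sequence number
    payload_len: int  # TCP payload bytes (0 for a pure ACK)


def classify(segments):
    """Label each segment by comparing its SEQ with what its sender already sent."""
    next_seq = {}  # per sender: sequence number just past the last payload byte seen
    labels = {}
    for s in segments:
        expected = next_seq.get(s.src, s.seq)
        if s.payload_len == 0:
            labels[s.frame] = "pure ACK"
        elif s.seq < expected:
            # Payload starts inside data this sender has already sent.
            labels[s.frame] = "retransmission"
        else:
            labels[s.frame] = "new data"
        next_seq[s.src] = max(expected, s.seq + s.payload_len)
    return labels


# Hypothetical values for illustration only, NOT read from the capture.
example = [
    Segment(frame=1, src="server", seq=1, payload_len=0),  # empty ACK
    Segment(frame=2, src="server", seq=1, payload_len=1),  # same SEQ, first data byte
    Segment(frame=3, src="server", seq=1, payload_len=1),  # same byte sent again
]
print(classify(example))
# {1: 'pure ACK', 2: 'new data', 3: 'retransmission'}
```

Under these assumptions, a data segment that reuses the SEQ of a preceding empty ACK is labelled as new data, because a pure ACK does not consume sequence space; only a repeat of a byte already sent is labelled a retransmission.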

The capture was done on a passive 10 Mbit/s hub inserted between the card reader and its switch port (Cisco 2960S), using the onboard Intel NIC of a Lenovo T520.

I was considering that the card reader's ACK might have got lost somewhere or had CRC errors, and that the Intel NIC might have forwarded the damaged frames to libpcap anyway.

However, I doubt that there are any invalid frames at all. During the months we spent tracking down the issue, the Cisco's switch port never saw a single invalid incoming frame (CRC, undersize, etc.). During the capture with the 10 Mbit/s hub, there wasn't even a single collision on that port, although it was running "10-half" at the time.

Upstream bandwidth from the access switch is plentiful, and we have no indication that quality suffers anywhere in the network - and they're doing VoIP and all.

QUESTIONS
=========
a) can these observations be called "retransmissions"?
b) if yes, is there a reason why Wireshark's SEQ/ACK analysis [Version 1.10.5 (SVN Rev 54262 from /trunk-1.10)] would not detect them as such?
c) are there any knobs to turn in Wireshark to make this form of "retransmissions" show up?

d) is sending "same SEQ/ACK plus PSH" a known form of "cattle prodding a lagging TCP peer"?
e) is 2.5sec a known "wait time" or "timeout" in common TCP implementations? (from which I will conclude that there must've been some packet loss all the same)


Thanks a lot for your thoughts!

Marc



Attachment: to retransmit or not to retransmit.pcapng
Description: Binary data