Wireshark-dev: Re: [Wireshark-dev] Unit tests for dissectors
From: Peter Wu <peter@xxxxxxxxxxxxx>
Date: Wed, 19 Dec 2018 22:32:50 +0100
Hi Atli,

On Mon, Dec 17, 2018 at 11:30:28PM +0000, Atli Guðmundsson wrote:
> I've recently created reviews for:
> 
>    - an update of the ASTERIX dissector to support Category 019
>    - unit tests of the Category 019 support
>    - generic unit test dissection validator (used by the above unit tests)
> 
> I will soon add support for Category 020 and would like to know if I'm on
> the right track with the unit tests approach so that I can add on top of
> this or if I should go another way?
> 
> This is because during my development I could only find a small set of unit
> tests for existing dissectors, but no extensive ones.
> I was thus also wondering if there was a particular reason for this?

The current test suite has some tests for core functionality (TCP
reassembly, decryption, display filters, etc.). Dissectors are not
thoroughly tested due to the sheer number of possible cases. Attaching
an unsanitized capture file for every possible dissector and subcases
would significantly increase the repository size and test suite runtime.

If you have a capture file for validation purposes, consider opening a
bug report and attaching the capture file to that (one by one, no zip
file please). Then reference it in the commit message. See
https://wiki.wireshark.org/Development/SubmittingPatches#Writing_a_Good_Commit_Message

To find earlier capture files or tests for a dissector, you could also
consult the git logs for a file, e.g.:

    git log --stat epan/dissectors/packet-asterix.c

Search for "Bug" to find references. Some captures might also exist on
the wiki, in particular https://wiki.wireshark.org/SampleCaptures

If you do create a new test, I usually try to follow these principles:
- Try to minimize the capture file size. For example, use pcap instead of
  pcapng, apply a display filter to extract interesting cases or discard
  unimportant ones (e.g. "tcp.len > 0").
- If necessary, use tools like Scapy to craft packets or simulate
  traffic in a controlled environment.
- Try to combine cases. Executing tshark is expensive, batching multiple
  cases into one will result in faster test completion and make other
  developers happy. I see you are already trying this in one of the
  proposed patches :)
- Skip tests for "simple" dissections. For example, checking every
  header of a HTTP message is probably overkill if the dissection
  routines are similar. Testing one header could already be sufficient.

While we're discussing tests, let me provide some history. Before
Wireshark 2.9.x, the test suite was written in Bash. This was ran fully
sequentially, was not portable to Windows and had rather crude checks
(grep for some output). In this development cycle, these tests were
ported to Python unittest which greatly increased the expressiveness of
tests. It is still heavily reminiscent of Bash though (see the various
self.grepOutput calls).

Since it is Python-based, adding compatibility for pytest was a next
step to speed up tests (pytest -nauto for full test-level parallelism!)
and improve modularity (fixtures!). At the moment pytest is not
mandatory though, so a minimal compatibility shim exists
(test/fixtures.py).

As for dissector tests, there are currently some different approaches:
- Check the output of "tshark -Tfields". Use assertsIn, assertEquals as
  needed.
- Check the output of "tshark -Tjson". Use the json module (possibly
  with matchers.py) to match a subset. See the sharkd suite for example.
- Use display filters and count the number of matching lines.
- Have a Lua dissector that performs the checks. (I am mentioning it for
  completeness, this is probably not useful for your changes).

None of these are exhaustive due to earlier mentioned constraints
(speed, repository size). Another experimental external test framework
was proposed before, but that has not gained much traction so far:
https://github.com/wireshark/happy-shark
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl