Wireshark-dev: Re: [Wireshark-dev] Collection of captures for each supported dissector?
From: Peter Wu <peter@xxxxxxxxxxxxx>
Date: Mon, 30 Jun 2014 16:52:30 +0200
(adding back the list, adding Gerald)

On Monday 30 June 2014 09:33:29 Evan Huus wrote:
> On Mon, Jun 30, 2014 at 9:05 AM, Peter Wu <peter@xxxxxxxxxxxxx> wrote:
> > On Monday 30 June 2014 07:12:56 Evan Huus wrote:
> > > The "menagerie" is our collection of capture files that the fuzz-bot
> > > uses to test with. It contains a substantial number of files across as
> > > many protocols as we have been able to accumulate. However, I am not
> > > sure it is entirely publicly accessible?
> > 
> > I have seen the menagerie mentioned in bug reports, but could never find
> > it publicly.
> 
> The best public collection I've been able to find is [1] which is all the
> fuzzed captures that have ever caused the fuzz-bot to fail. It's worth
> noting that the vast majority of the remaining menagerie (~90% at a rough
> guess) is harvested from Bugzilla attachments, so most of the individual
> captures are already public, they're just not easily browseable.

That is 5 GiB of non-specific captures; I think I'll pass on this for now.
 
> [1] http://www.wireshark.org/download/automated/captures/
> 
> > > Additionally, it is not indexed. There is a script somewhere to use
> > > tshark to extract the protocols contained in each capture and build a
> > > list, but it only works for protocols which are dissectible by default
> > > (no "decode as", decryption, or other special settings usually).
> > > 
> > > One of the ideas floated at sharkfest this year was the possibility of a
> > > proper interface to the menagerie, but I don't think anything really
> > > came of it. What protocol are you interested in right now?
> > 
> > There is no particular protocol I am interested in; it was an idea to
> > improve regression testing. Right now I am looking at all dissectors
> > below TCP (or on top, depending on how you look at it).
> > 
> > 
> > By the way, could I get delete permissions for attachments for the
> > SampleCaptures page on the wiki?
> 
> I think Gerald has to grant this.

Gerald, could I get delete privileges for the captures on the SampleCaptures 
page?

> > There are a bunch of duplicates (and even
> > some empty files) listed as attachments but not linked. Some are not even
> > capture files, although their extension suggests so.
> > 
> > Empty files:
> > mount-de.pcap.gz
> > omron-test-csum.pcap
> > wireshark.org.pcap.gz
> > 
> > Not pcap (but tcpdump text output or even a media file):
> > packetout.pcap
> > RTSP.pcap
> > 
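Files like the ones above can be spotted mechanically by looking at the first few bytes. The following is only a rough sketch: the function name classify_capture is made up for illustration, "empty" here means zero bytes (a capture with a valid header but no packets would need something like capinfos instead), and it assumes a head that supports -c.

```shell
# classify_capture FILE
# Hypothetical helper: prints empty, pcap, pcapng, gzip or unknown,
# based only on the file size and the first four bytes.
classify_capture() {
    f=$1
    # Zero-byte files have no magic number to inspect.
    if [ ! -s "$f" ]; then
        echo empty
        return
    fi
    # Hex-dump the first four bytes without offsets or spaces.
    magic=$(head -c 4 "$f" | od -An -tx1 | tr -d ' \n')
    case $magic in
        # libpcap magic, both byte orders, micro- and nanosecond variants
        a1b2c3d4|d4c3b2a1|a1b23c4d|4d3cb2a1) echo pcap ;;
        # pcapng Section Header Block type
        0a0d0d0a) echo pcapng ;;
        # gzip container; the payload inside is not inspected
        1f8b*) echo gzip ;;
        *) echo unknown ;;
    esac
}
```

Something like `for f in *; do printf '%s\t%s\n' "$(classify_capture "$f")" "$f"; done` then gives a quick overview; tcpdump text output or a media file with a .pcap extension shows up as "unknown".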
> > Duplicates can be found with:
> > md5sum * | sort | uniq -w32 -D | while read -r sum file; do
> >   echo "$sum $(date +"%Y-%m-%d %H:%M" -r "$file") $(du -hD "$file")"
> > done
> > 
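To turn the duplicate listing into a list of candidates for deletion, one option is to keep the first file of each checksum group and print the rest. This is a sketch only: redundant_copies is a made-up name, it reads md5sum-style lines on stdin, and it assumes file names without newlines or backslashes (md5sum escapes those).

```shell
# redundant_copies
# Hypothetical helper: reads "checksum<space><mode char>filename" lines,
# as printed by md5sum, and prints every file name except the first one
# (in sort order) of each group of identical checksums.
redundant_copies() {
    sort | awk '{
        sum = $1
        # Strip the checksum, the separating space and the mode character
        # (" " for text mode, "*" for binary mode) to get the file name.
        sub(/^[^ ]+ [ *]/, "")
        if (sum == prev) print
        prev = sum
    }'
}
```

Usage would be along the lines of `md5sum * | redundant_copies`, after which the output can be reviewed by hand before anything is actually removed.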
> > Are there known efforts to index the files? I don't think that the wiki
> > is a sustainable way to collect them.
> 
> No efforts I know of, but I agree the current method isn't scaling.
> 
> The script I mentioned to get the list of protocols in one (or more)
> capture files is in git as ./tools/list_protos_in_cap.sh. Pipe it to a text
> file and then grep for the protocols you're looking for.

Thanks for the pointers; that may already be sufficient for my validation
purposes.
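For the record, a rough sketch of the same idea using tshark directly instead of tools/list_protos_in_cap.sh (whose exact output format I have not checked). It assumes tshark is on the PATH; the function names unique_protos and captures_with_proto are made up for illustration. The same limitation applies: only protocols dissected with default settings are found.

```shell
# unique_protos
# Reads colon-separated protocol stacks (one per line, as printed by
# "tshark -T fields -e frame.protocols") and prints each protocol once.
unique_protos() {
    tr ':' '\n' | sed '/^$/d' | sort -u
}

# captures_with_proto PROTO FILE...
# Hypothetical helper: prints the capture files in which at least one
# frame contains the protocol PROTO (for example "http").
captures_with_proto() {
    proto=$1
    shift
    for cap in "$@"; do
        if tshark -r "$cap" -T fields -e frame.protocols 2>/dev/null |
            unique_protos | grep -qxF -- "$proto"; then
            printf '%s\n' "$cap"
        fi
    done
}
```

For example, `captures_with_proto http *.pcap` would list the captures that contain HTTP according to the default dissection.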

Kind regards,
Peter
https://lekensteyn.nl