Wireshark-users: Re: [Wireshark-users] Filtering a very large capture file
From: "Stuart MacDonald" <stuartm@xxxxxxxxxxxxxxx>
Date: Fri, 26 Jan 2007 10:56:52 -0500
From: On Behalf Of Guy Harris
> On Jan 25, 2007, at 8:23 PM, Stuart MacDonald wrote:
> > I've read the man pages on the tools that come with Wireshark. I was
> > hoping to find a tool that opens a capture, applies a filter and
> > outputs matching packets to a new file. Here's a sample run of the
> > hypothetical filtercap tool:
> > # filtercap -r very-large.eth -w only-infrequent.eth -f "tcp.port==50000"
> 
> 	tcpdump -r very-large.eth -w only-infrequent.eth tcp port 50000

tcpdump isn't part of Wireshark. Thanks though; I think I still have
the original capture on the original machine, and this would be faster
than the shell script solution elsethread.

> That can't do arbitrary display filtering, but truly *arbitrary*
> display filtering has problems with reassembly (i.e., a filter that
> matches something in the reassembled portion of the packet can't
> match anything but the last packet).  It also can't handle non-libpcap

Fair enough. What exactly constitutes the reassembled portion? I'm
thinking it's things like the TCP analysis; "Zero Window" status etc.
<mulls> I guess it's anything that can't be expressed as a capture
filter.

Interesting. I've located
http://wiki.wireshark.org/TCP_Reassembly
and those options are off (by default) for my Wireshark. Are they not
also off (by default) for tshark?

> > tshark is almost the right thing, except that tshark also tries to
> > read in the whole capture first instead of processing it like
> > editcap.
> 
> No, actually, it *does* process it like editcap; neither it nor  
> Wireshark read the entire capture file into memory.  They *do* keep  
> reassembled data in memory, but that's another matter.

Let me rephrase that then. tshark also bails with the out of memory
crash, just like Wireshark. editcap does not. I assumed that was due
to the method of processing the file, but I see now that it's due to
reassembly, and this is perhaps why editcap doesn't filter on anything
but frame numbers and time; it avoids reassembly by doing so.

Hm, the research on TCP Reassembly from above makes me think the
crashes are not due to reassembly after all. Is that a new bug in
Wireshark/tshark then?

Ah yes. tshark refuses to apply a capture filter when reading from a
file, thereby enforcing a display filter and the subsequent crash. I
suppose that it can't apply a capture filter because it's not using
libpcap to get the packets in the first place? Perhaps libpcap needs
to be taught how to use a file instead of an interface.

Is there a way to turn off reassembly so that tshark would work the
same as tcpdump in the above example? Although now it looks like it
should be off (by default).

..Stu