Wireshark-users: Re: [Wireshark-users] Display dumpcap in real time
From: Chip <jeffschips@xxxxxxxxx>
Date: Tue, 01 Nov 2011 16:51:32 -0400
On 11/1/2011 4:38 PM, Guy Harris wrote:
On Nov 1, 2011, at 12:52 PM, Chip wrote:

 From what you wrote I gather you mean that because of privileges, dumpcap cannot write to a monitor, in real time, the content (whether hex, ascii or text) of the captured packets?
I mean that, because dumpcap might run with special privileges, we will never ever ever ever ever ever ever ever ever ever ever put any dissecting code into it, so that it will never ever ever... be able to display the dissected contents of the captured packets as text the way TShark, for example, does.  (And even in those cases where it doesn't run with special privileges or can give them up before running the dissecting code, burdening dumpcap with dissecting code would bloat it up in ways not useful for its main purposes, namely

	1) acting as a separate capture process for Wireshark, so that capturing and the user interface run in separate threads of control;

	2) acting as a stand-alone capture process in the cases where somebody *only* wants the packets dumped into a file for later analysis;

	3) acting as a separate capture process for Wireshark and TShark on platforms where special privileges are required for capture.)

It currently has no option to just dump the raw bytes of those packets as hex or ASCII, either.  There's probably no strong security reason not to do that, as the code to dump the raw bytes is simpler, but it does mean adding more stuff.

To clear up what I said about "only want to capture the ip addresses of the two endpoints", I mean I only want to collect the endpoint ip addresses along with time stamp -- not interested in any other data that dumpcap may collect.
That is not the sort of thing that a capture filter can do.  A capture filter selects which packets to capture, not how much of the packet to capture.

The snapshot length can be used to trim the captured packets to a fixed maximum length; it's specified with the -s command-line option (and, in Wireshark, with an item in the Capture Options dialog).  If you set the capture length to a value that captures the largest link-layer header and the largest IP header you'd expect, that would give you the IP addresses and everything before that - the snapshot length only causes stuff to be cut off at the end of the packet, so there is no way to, for example, capture only the IP addresses and not the link-layer headers or previous data.
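
A minimal sketch of that arithmetic, assuming Ethernet link-layer headers (optionally with 802.1Q VLAN tags) and IPv4 only. The constant and function names here are made up for illustration; the result is the sort of value you would hand to -s.

```python
# Sketch only: assumes Ethernet framing and IPv4. Other link-layer types
# (802.11, PPP, "cooked" captures, ...) have different header lengths.
ETHERNET_HEADER_LEN = 14      # destination MAC + source MAC + EtherType
VLAN_TAG_LEN = 4              # one 802.1Q tag, if present
IPV4_MIN_HEADER_LEN = 20      # the two addresses occupy bytes 12..19
IPV4_MAX_HEADER_LEN = 15 * 4  # IHL is a 4-bit count of 32-bit words


def snapshot_length(link_header_len=ETHERNET_HEADER_LEN, vlan_tags=0,
                    full_ip_header=True):
    """Return a snapshot length that reaches the end of the IPv4 header.

    With full_ip_header=False the value stops right after the destination
    address, which is enough if only the addresses are wanted.
    """
    ip_len = IPV4_MAX_HEADER_LEN if full_ip_header else IPV4_MIN_HEADER_LEN
    return link_header_len + vlan_tags * VLAN_TAG_LEN + ip_len


if __name__ == "__main__":
    print(snapshot_length())                      # largest IPv4 header
    print(snapshot_length(full_ip_header=False))  # just up to the addresses
```

Everything before the addresses (the link-layer header and the first 12 bytes of the IP header) is still captured either way, since the snapshot length only trims from the end.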

The purpose of the project is to capture the domain names of visited websites whilst using a passive tap inline, so it is impossible to determine beforehand the sites that will be visited by users.  We are not interested in ftp, smtp, etc., traffic and only interested in the ip addresses of the domains visited for websites.  I understand that Tshark may have a more robust filtering schema
Well, it supports display filters, but the stuff I was discussing was its capability for dissecting packet data (atop which the display filter mechanism is built).  "Filter", in the context of *Shark and dumpcap, refers very narrowly to the ability to select which packets to capture or display; it does not refer to any ability to analyze a packet in detail or to display selected bits of information from that dissection.

but dumpcap uses less memory and this project will be collecting over many days so Tshark with its larger memory footprint may not be as good as dumpcap.
If you want to see the packets dissected and the dissection displayed as text, at least some of that memory is, alas, going to be necessary.  If the packets are cut short with a snapshot length, the good news is that TShark will not be able to do very much dissection, so it will probably consume less memory.  The bad news is that it'll throw an exception on most if not all packets when it runs into the end of the captured data, which might take a bit more CPU.

If you really *only* want the IP addresses, the best strategy might be to write your own program using libpcap, and have it do an absolute minimum amount of dissection - capture with a capture filter that specifies only IP packets and perhaps only captures packets to and from the ports in which you're interested (for HTTP, for example, capture port 80, port 443 for HTTP-over-SSL, and maybe some other ports), specify a snapshot length so all that gets handed to your program is the link-layer header and the IP header up to and including the addresses, and just dump out the time stamps and addresses, in your own file format, to the output file.

This program could be written in a number of programming languages - C, C++, Java, Perl, Python, Ruby, Common Lisp, various .NET languages, etc.
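
To give a feel for how little dissection such a program needs, here is a rough Python sketch of just the parsing step. It assumes Ethernet framing (with at most one 802.1Q tag) and IPv4, and it assumes the truncated packet bytes and time stamp are handed to it by whatever libpcap binding you pick; that capture loop is not shown. The names endpoint_addresses, format_record, and CAPTURE_FILTER, and the tab-separated output format, are invented for illustration.

```python
import ipaddress
import struct

# Hypothetical capture filter for the capture side: IP packets on the web ports.
CAPTURE_FILTER = "ip and (tcp port 80 or tcp port 443)"

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_VLAN = 0x8100


def endpoint_addresses(frame):
    """Return (source, destination) IPv4 addresses as strings.

    Returns None if the frame is not IPv4 or was cut off before the
    addresses. Assumes an Ethernet header, optionally with one VLAN tag.
    """
    if len(frame) < 14:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    offset = 14
    if ethertype == ETHERTYPE_VLAN:
        if len(frame) < 18:
            return None
        (ethertype,) = struct.unpack_from("!H", frame, 16)
        offset = 18
    if ethertype != ETHERTYPE_IPV4:
        return None
    # The addresses are bytes 12..19 of the IP header.
    if len(frame) < offset + 20 or frame[offset] >> 4 != 4:
        return None
    src, dst = struct.unpack_from("!4s4s", frame, offset + 12)
    return str(ipaddress.IPv4Address(src)), str(ipaddress.IPv4Address(dst))


def format_record(timestamp, frame):
    """Return one 'timestamp<TAB>source<TAB>destination' line, or None."""
    addrs = endpoint_addresses(frame)
    if addrs is None:
        return None
    return "%.6f\t%s\t%s" % (timestamp, addrs[0], addrs[1])
```

Each captured packet would then cost one format_record() call and one line written to the output file, with no per-connection state kept in memory.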

Another possibility might be to use tcpdump, which does less-fancy dissection that would be sufficient for your purposes; however, tcpdump doesn't support dissecting *and* saving raw packet data to a file, so if you need that, you can't do that with tcpdump.  If all you care about is time stamps and IP addresses, however, you may not need the raw packet data.  (If you do end up using tcpdump, run it with the -S flag, so that it doesn't consume memory keeping track of TCP connections to provide relative sequence numbers.)
Okay great Guy, that's perfectly clear now. I think I will go with tcpdump -S as really I am only caring about the connection information and not raw packet information.

Do you think tcpdump can hold up to running for hours capturing connection information without crashing a system because of memory usage? Can one use a ring buffer feature in tcpdump like the one in dumpcap?

Thank you.