Wireshark-dev: Re: [Wireshark-dev] Wireshark memory handling
From: Jeff Morriss <jeff.morriss.ws@xxxxxxxxx>
Date: Fri, 09 Oct 2009 10:43:12 -0400
Erlend Hamberg wrote:
On Friday 9. October 2009 03.47.16 didier wrote:
I don't see what you would get with mmaped files vs enough swap. But if
you are using wireshark, ie working interactively, it'd be slow, slow as
in unusable.

One advantage of using memory mapped files instead of swap is that if your OS is swapping, *everything* is slow. If only Wireshark is, er, swapping, only Wireshark is slow.

(It had been a while since I had experienced real swapping--then I left tshark running on a buildbot fuzz failure. After tshark used half my swap, it took a good 10-15 minutes for the process to die after I killed it--during which time my machine was completely useless.)

But we never use wireshark if it needs to hit hard disks (for us roughly
3 times the file size), it's too slow.

Too slow, full stop? Our experience in using disk-cached data in interactive programs is very limited, but our naïve assumption was that if the data is sequential and the operating system's disk buffering system does its job, it should be possible to work with this solution. It is of course hard to put exact numbers on how fast something has to be, but if the speed dropped below a level where it's not possible to use the program interactively at all, this solution is of no use.

Give it a try.  Here's what I just did (for fun :-)):

Starting condition: a whole bunch of RAM free (I forget how much, probably around 2-2.5 GB), no swap in use.

Start wireshark loading a 7.6 million packet capture file.
Linux appears to keep free RAM in the ~20-30 MB range, but swap usage grows to 1.5 GB. Wireshark consumes 4120 MB.

Scrolling around the packet list is actually not too slow now. But doing even something simple (that does not involve going through the packet list) like Statistics->Summary takes 1-2 minutes. (Normally the summary page is basically instantaneous.) I didn't even try filtering.

Using memory mapped files would probably help quite a bit with keeping the UI responsive because only some of Wireshark's data (the packet data, for example) would be on disk, while the executable pages and "core" memory like the statistics could be kept in RAM (or at least in whatever RAM the OS gives us).

One could imagine having different allocators (like ep_ and se_ today) that use malloc & friends (for "core" stuff) or go to the mmap'd pool (for packet data, etc.).
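
To make that idea a bit more concrete, here is a minimal sketch of what the mmap'd side might look like: a bump allocator backed by an unlinked temporary file. It assumes a POSIX system, and every name in it (mm_pool, mm_pool_init, mm_pool_alloc, mm_pool_free_all) is hypothetical; none of this is Wireshark code, and the temp-file path and 16-byte alignment are arbitrary choices. "Core" data would simply keep going through malloc & friends.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical file-backed pool; these names do not exist in Wireshark. */
typedef struct {
    unsigned char *base;  /* start of the mapping */
    size_t size;          /* total bytes mapped */
    size_t used;          /* bump offset of the next free byte */
    int fd;               /* backing file descriptor */
} mm_pool;

/* Map `size` bytes of an unlinked temp file. Returns 0 on success, -1 on error. */
static int mm_pool_init(mm_pool *p, size_t size)
{
    char path[] = "/tmp/mm_pool_XXXXXX";  /* assumed location */

    assert(p != NULL && size > 0);
    p->fd = mkstemp(path);
    if (p->fd < 0)
        return -1;
    unlink(path);  /* the file's space is reclaimed once the fd is closed */
    if (ftruncate(p->fd, (off_t)size) != 0) {
        close(p->fd);
        return -1;
    }
    /* MAP_SHARED: dirty pages are written back to the file, not to swap. */
    p->base = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, p->fd, 0);
    if (p->base == MAP_FAILED) {
        close(p->fd);
        return -1;
    }
    p->size = size;
    p->used = 0;
    return 0;
}

/* Hand out `n` bytes rounded up to 16; returns NULL when the pool is full. */
static void *mm_pool_alloc(mm_pool *p, size_t n)
{
    size_t aligned = (n + 15) & ~(size_t)15;
    void *ptr;

    if (aligned < n || aligned > p->size - p->used)
        return NULL;
    ptr = p->base + p->used;
    p->used += aligned;
    return ptr;
}

/* Release everything at once, much like a scoped allocator would. */
static void mm_pool_free_all(mm_pool *p)
{
    munmap(p->base, p->size);
    close(p->fd);
    p->base = NULL;
    p->size = 0;
    p->used = 0;
}
```

A caller would do mm_pool_init(&pool, some_size) once, use mm_pool_alloc() for packet data, and call mm_pool_free_all() when the capture file is closed. A real version would also need to grow the mapping and decide how to handle per-packet frees; this sketch does neither.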