Erlend Hamberg wrote:
On Friday 9. October 2009 03.47.16 didier wrote:
I don't see what you would get with mmapped files vs. enough swap. But if
you are using Wireshark, i.e. working interactively, it'd be slow, slow as
in unusable.
One advantage of using memory mapped files instead of swap is that if
your OS is swapping, *everything* is slow. If only Wireshark is, er,
swapping, only Wireshark is slow.
(It had been a while since I had experienced real swapping--then I left
tshark running on a buildbot fuzz failure. After tshark used half my
swap, it took a good 10-15 minutes for the process to die after I killed
it--during which time my machine was completely useless.)
But we never use Wireshark if it needs to hit the hard disk (for us that's
roughly 3 times the file size); it's too slow.
Too slow, full stop? Our experience with using disk-cached data in interactive
programs is very limited, but our naïve assumption was that if the data is
sequential and the operating system's disk buffering does its job, it should
be possible to work with this solution. It is of course hard to put exact
numbers on how fast something has to be, but if the speed drops below a level
where it's not possible to use the program interactively at all, this
solution is of no use.
Give it a try. Here's what I just did (for fun :-)):
Starting condition: a whole bunch of RAM free (forget how much, probably
around 2-2.5 GB), no swap in use.
Start wireshark loading a 7.6 million packet capture file.
Linux appears to keep free RAM in the ~20-30 MB range, but swap usage
grows to 1.5 GB. Wireshark consumes 4120 MB.
Scrolling around the packet list is actually not too slow now. But
doing even something simple (that does not involve going through the
packet list) like Statistics->Summary takes 1-2 minutes. (Normally the
summary page is basically instantaneous.) I didn't even try filtering.
Using memory mapped files would probably help quite a bit with keeping
the UI responsive, because only Wireshark's packet data, for example,
would be on disk, while the executable pages and "core" memory like the
statistics could be kept in RAM (or at least in whatever the OS gives us).
One could imagine having different allocators (like ep_ and se_ today)
that use malloc & friends (for "core" stuff) or go to the mmap'd pool
(for packet data, etc.).