Ethereal-dev: Re: [Ethereal-dev] Memory allocation witchhunt??

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Guy Harris <gharris@xxxxxxxxx>
Date: Thu, 28 Apr 2005 13:27:42 -0700
Visser, Martin wrote:

> (Curiously, if I try to exit Ethereal, the mem
> usage climbs up to 200MB, and goes up and down swapping madly before
> eventually I give up and just kill the process.)
See below for one thing that I think causes this (destroying the widget 
that displays the list of packets requires it to free the strings it 
allocated for each column of each row, and that can take a while as it 
drags tons of stuff into the cache).
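For reference, the packet list is based on the GtkClist widget, which 
makes its own copy of every column string it's handed when a row is 
added; roughly like this (the column values below are just made up, 
and packet_list is assumed to have been created with gtk_clist_new(6)):

#include <gtk/gtk.h>

/* Illustration only: GtkCList duplicates each column string passed to
 * it, so an N-row, M-column packet list ends up holding N*M separately
 * allocated strings.  Destroying the widget has to free all of them,
 * which is part of why exiting with a huge capture loaded is so slow. */
static void
add_packet_row(GtkCList *packet_list)
{
    gchar *cols[6] = {
        "1", "0.000000", "10.0.0.1", "10.0.0.2",
        "TCP", "80 > 1025 [SYN, ACK]"
    };

    /* Each of these six strings is copied inside the widget; with a
     * million packets, that's six million little allocations to walk
     * and free again when the clist is destroyed. */
    gtk_clist_append(packet_list, cols);
}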
> I know I could get more than my current 512MB RAM for not a whole lot of
> money, but I guess one always has to stop somewhere. Also I know that
> you can do a lot by simply streaming through tethereal. But does anyone
> see any value in going on a memory witchhunt? I assume that memory is
> mainly chewed up by the dissected structures. Are there any efficiencies
> to be made here?
One thing that would, I suspect, make a significant difference would be 
to change the widget that displays the packet list so that, when you 
add a row to the list, it doesn't make copies of the text in all the 
columns, but instead calls back to a routine to get the contents of 
the columns.
I have a GTK+ widget, derived from the GTK+ 1.2.10 GtkClist widget, that 
does that; I haven't had a chance to work on it for a while, and don't 
remember what more needs to be done on the widget - other than making it 
work on GTK+ 2.x as well.  (There is such a widget in GTK+ 2.x; however, 
I think it might be significantly slower than the GtkClist widget in 
other ways.)
The best way to use such a widget is to have the callback routine read 
and dissect the packet in question, generating the column values (but 
not the protocol tree).  *If* random access to a capture file is 
reasonably efficient, that would *probably* work reasonably well when 
scrolling the packet list (especially if rapid scrolling causes the 
toolkit to compress updates, so that if a rapid scroll turns into a 
bunch of jumps, we don't try to dissect *all* of the packets that are 
nominally scrolled past, as they never actually get dragged into view).
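A rough sketch of that scheme (everything here - the list structure, 
the callback type, the column routine - is a made-up illustration, not 
an existing GTK+ or Ethereal API):

#include <stdio.h>

/* Made-up "virtual list": the widget stores no per-cell strings at
 * all; whenever it needs to draw a cell, it asks a callback, which
 * can seek to the frame and re-dissect it (columns only, no protocol
 * tree) to produce the text on demand. */

typedef const char *(*cell_text_cb)(unsigned row, unsigned col,
                                    void *user_data);

struct virtual_list {
    unsigned     n_rows;     /* number of packets in the capture */
    cell_text_cb get_text;   /* called instead of storing copies */
    void        *user_data;
};

/* Stand-in for "seek to frame 'row', dissect it without building a
 * protocol tree, and return the requested column value". */
static const char *
get_packet_column(unsigned row, unsigned col, void *user_data)
{
    static char buf[64];

    snprintf(buf, sizeof buf, "frame %u, column %u", row, col);
    return buf;
}

/* Drawing only touches the rows actually brought into view, so a fast
 * scroll that skips over rows never dissects the skipped ones. */
static void
draw_row(const struct virtual_list *list, unsigned row)
{
    unsigned col;

    if (row >= list->n_rows)
        return;
    for (col = 0; col < 6; col++)
        printf("%s\t", list->get_text(row, col, list->user_data));
    printf("\n");
}

int
main(void)
{
    struct virtual_list list = { 1000000, get_packet_column, NULL };

    draw_row(&list, 42);     /* only frame 42 gets "dissected" */
    return 0;
}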
Unfortunately, currently, although random access to an uncompressed file 
should be reasonably efficient, random access to a gzipped capture file 
is extremely inefficient.  I think it can be made efficient - somebody 
at NetApp who'd implemented code to read compressed core dump files for 
NetApp appliances said you can just save in memory the state of the 
decompression engine at various "checkpoints", so that to go to an 
arbitrary place in the file you go to the closest checkpoint before 
that place and then decompress forward from there - but that hasn't 
been done yet.  (Doing so means that we couldn't use zlib to read 
compressed files, as I don't think it has a way to save the state of 
the decompression engine and later restore that state from the saved 
information.  If we do it ourselves, that would simplify the configure 
script code that currently copes with testing for zlib and for various 
zlib deficiencies and bugs, it would mean that Ethereal would always 
support compressed files, and it would mean we could suppress the 
checksum checking when reading compressed files from, I think, Windows 
Sniffer - or maybe it was Shomiti Surveyor - which doesn't write out 
the standard gzip checksum at the end of the file.)
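To make that concrete, the bookkeeping would look roughly like this (a 
sketch only: the decompression engine here is a made-up stand-in for 
one we'd write ourselves, since zlib doesn't appear to let us copy its 
state):

#include <stdlib.h>

/* Made-up decompression engine whose state can be copied; a real one
 * would also need to save its 32KB sliding window at each checkpoint. */
struct inflate_state {
    long out_pos;           /* uncompressed data produced so far */
    /* ... the rest of the engine's state ... */
};

/* Stand-in for "keep decompressing until this much output exists". */
static void
inflate_forward_to(struct inflate_state *st, long target)
{
    st->out_pos = target;
}

#define CHECKPOINT_INTERVAL (1024L * 1024L)  /* snapshot every 1MB of output */

static struct checkpoint {
    long                 out_pos;   /* uncompressed offset of the snapshot */
    struct inflate_state state;     /* saved engine state at that point */
} *checkpoints;
static size_t n_checkpoints;

/* On the first, sequential pass through the file, remember the engine's
 * state at regular intervals of uncompressed output (including offset 0). */
static void
maybe_checkpoint(const struct inflate_state *st)
{
    if (st->out_pos % CHECKPOINT_INTERVAL != 0)
        return;
    checkpoints = realloc(checkpoints,
                          (n_checkpoints + 1) * sizeof *checkpoints);
    checkpoints[n_checkpoints].out_pos = st->out_pos;
    checkpoints[n_checkpoints].state = *st;
    n_checkpoints++;
}

/* To jump to an arbitrary uncompressed offset, restore the closest
 * checkpoint at or before it and decompress forward from there, rather
 * than restarting from the beginning of the gzipped file. */
static void
seek_uncompressed(struct inflate_state *st, long target)
{
    size_t i, best = 0;

    for (i = 0; i < n_checkpoints; i++)
        if (checkpoints[i].out_pos <= target)
            best = i;
    *st = checkpoints[best].state;
    inflate_forward_to(st, target);
}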
This might also mean that sorting by a column value could be *really* 
slow, depending on how many comparisons the sorting process does, as 
each comparison might have to dissect both packets just to generate 
the column values being compared.  There are probably ways of making 
that faster.
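One such way (just a sketch - get_column_value() is a made-up stand-in 
for "seek to the frame, dissect the columns only, and copy out one 
column") would be to generate each row's value for the sort column 
once, sort using those cached values, and throw them away afterwards, 
so each packet is dissected once per sort rather than once per 
comparison:

#include <stdlib.h>
#include <string.h>

/* Hypothetical: dissect each frame once to get its value for the sort
 * column, cache those strings, sort by the cache, then free it. */
extern char *get_column_value(unsigned frame_number, int column);

struct sort_key {
    unsigned frame_number;
    char    *value;         /* cached column text for this frame */
};

static int
compare_keys(const void *a, const void *b)
{
    const struct sort_key *ka = a, *kb = b;

    return strcmp(ka->value, kb->value);
}

/* Reorders 'rows' by the given column; each frame is dissected exactly
 * once, instead of once per comparison. */
static void
sort_rows_by_column(unsigned *rows, unsigned n_rows, int column)
{
    struct sort_key *keys = malloc(n_rows * sizeof *keys);
    unsigned i;

    for (i = 0; i < n_rows; i++) {
        keys[i].frame_number = rows[i];
        keys[i].value = get_column_value(rows[i], column);
    }
    qsort(keys, n_rows, sizeof *keys, compare_keys);
    for (i = 0; i < n_rows; i++) {
        rows[i] = keys[i].frame_number;
        free(keys[i].value);
    }
    free(keys);
}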
Another way to use such a widget, which wouldn't save as much memory, 
but would avoid the need to change the way we handle compressed files 
and column sorting, would be to, when generating the packet list, save 
the Protocol and Info column values as strings, and either save the 
address column values as strings or save addresses in a data structure 
(so that only one copy of each address seen is stored, with the 
structure also having a pointer to a resolved name) and store pointers 
to the address in question in the frame_data structure.  (That might 
also provide infrastructure for saving and restoring address-to-name 
tables.)
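A sketch of what that shared address structure might look like, using 
a GLib hash table (the structure and function names are made up, and 
the corresponding frame_data change - pointers for the source and 
destination addresses - isn't shown):

#include <glib.h>
#include <string.h>

/* Made-up structure: keep one copy of each address seen, plus its
 * resolved name; each frame's record would just point at the shared
 * entry instead of storing its own column strings. */
struct stored_address {
    guint8  type;            /* e.g. an AT_IPv4-style address type */
    guint   len;
    guint8  data[16];        /* big enough for IPv6 */
    gchar  *resolved_name;   /* filled in lazily by name resolution */
};

static GHashTable *address_table;   /* stored_address -> itself */

static guint
address_hash(gconstpointer key)
{
    const struct stored_address *a = key;
    guint h = a->type, i;

    for (i = 0; i < a->len; i++)
        h = h * 31 + a->data[i];
    return h;
}

static gboolean
address_equal(gconstpointer pa, gconstpointer pb)
{
    const struct stored_address *a = pa, *b = pb;

    return a->type == b->type && a->len == b->len &&
           memcmp(a->data, b->data, a->len) == 0;
}

/* Return the single shared copy of this address, creating it the first
 * time it's seen; frame_data would hold this pointer for src/dst. */
static const struct stored_address *
store_address(guint8 type, const guint8 *data, guint len)
{
    struct stored_address probe, *found;

    if (address_table == NULL)
        address_table = g_hash_table_new(address_hash, address_equal);

    memset(&probe, 0, sizeof probe);
    probe.type = type;
    probe.len = len;
    memcpy(probe.data, data, len);

    found = g_hash_table_lookup(address_table, &probe);
    if (found == NULL) {
        found = g_memdup(&probe, sizeof probe);
        g_hash_table_insert(address_table, found, found);
    }
    return found;
}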
> I also notice that when you, say, run protocol hierarchy stats, you 
> still have to run through all the dissectors again anyway, so is some 
> of the stored info wasted anyway?
It shouldn't be.  If the state already exists, dissectors shouldn't be 
re-generating it; most if not all don't re-generate it.  If a 
re-dissection is done (e.g., after changing a protocol preference), the 
state should be discarded and re-generated, as it might change.
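For example, the usual dissector idiom is to build per-conversation 
state only on the first pass over a frame; the state type and the 
lookup/create helpers below are made-up stand-ins, but the 
visited-flag check is the normal pattern:

#include <epan/packet.h>

typedef struct foo_state foo_state_t;
extern foo_state_t *lookup_foo_state(packet_info *pinfo);   /* hypothetical */
extern foo_state_t *create_foo_state(packet_info *pinfo);   /* hypothetical */

static void
dissect_foo(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree)
{
    foo_state_t *state = lookup_foo_state(pinfo);

    if (state == NULL && !pinfo->fd->flags.visited) {
        /* First sequential pass over the capture: build the state.
         * Later passes - taps such as the protocol hierarchy stats,
         * clicking on a packet, and so on - find it already there and
         * just reuse it.  If a full re-dissection is done (e.g. after
         * a preference change), the old state is discarded and this
         * runs again. */
        state = create_foo_state(pinfo);
    }

    /* ... dissect using 'state', adding to 'tree' if it's non-null ... */
}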
> I know that Richard Sharpe (and maybe others) occasionally run Ethereal
> through a profiler to look for CPU hogs. I guess I wonder if (and how)
> there should also be memory profiling done as well?
Just checking for leaks would help.  There's a "leaks" tool in OS X; I 
did some checking for leaks, found a few, and plugged a few, but didn't 
have time to go further.
"leaks" can't, as far as I know, report on non-leaked memory, but there 
might be tools to do that.  It'd definitely be worth doing.