Ethereal-dev: Re: [Ethereal-dev] Performance. Ethereal is slow when using large captures.

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Richard Sharpe <rsharpe@xxxxxxxxxxxxxxxxx>
Date: Thu, 13 Nov 2003 21:55:37 -0800 (PST)
On Fri, 14 Nov 2003, Ronnie Sahlberg wrote:

> Ethereal is very slow when working with very large captures containing
> hundreds of thousands of packets.
> Every operation we do that makes ethereal rescan the capture file takes a
> long time.

Yup. I have several 300-400+ MB captures of NetBench traces, etc., and things 
can get really bad :-(
 
> I would like to start looking at methods to make ethereal faster, especially
> for large captures, trading a larger memory footprint
> for faster refiltering speed, but I would like to know if there are others
> interested in helping.
> (You don't have to be a programmer; someone who knows how to run gprof and
> read the data would be very helpful as well.)

Well, I am a programmer, and, within the constraints of the little 
remaining time I have, I would like to help work on this.
 
> If there is interest in this, I propose we take 0.9.16 as the baseline, and
> measure the execution speed and memory footprint of ethereal
> for this version to start with.

Sounds good.

> A good example capture might be made by setting snaplen to 100 and downloading
> ethereal from the website over and over until the capture file has 100,000
> packets (10 MB).

Well, I can supply some fraction of my NetBench captures :-)
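
For generating a capture like that, a minimal libpcap sketch would do
(snaplen of 100, dumping straight to a file; the interface name, packet
count and output file name below are just placeholders, and tcpdump or
tethereal with "-s 100" gets you the same thing):

  /* Sketch only: capture with a 100-byte snaplen and dump to a file. */
  #include <pcap.h>
  #include <stdio.h>

  int main(void)
  {
      char errbuf[PCAP_ERRBUF_SIZE];
      pcap_t *pc;
      pcap_dumper_t *dumper;

      /* 100-byte snaplen, promiscuous mode, 1000 ms read timeout */
      pc = pcap_open_live("eth0", 100, 1, 1000, errbuf);
      if (pc == NULL) {
          fprintf(stderr, "pcap_open_live: %s\n", errbuf);
          return 1;
      }

      dumper = pcap_dump_open(pc, "example.cap");
      if (dumper == NULL) {
          fprintf(stderr, "pcap_dump_open: %s\n", pcap_geterr(pc));
          return 1;
      }

      /* write 100,000 packets to the dump file, then stop */
      pcap_loop(pc, 100000, pcap_dump, (u_char *)dumper);

      pcap_dump_close(dumper);
      pcap_close(pc);
      return 0;
  }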

> Create three color filters: "ip", "tcp" and "frame.pkt_len>80" (random
> filter strings off the top of my head).
> Open the UDP and TCP conversation lists.
> Load the capture into ethereal.
> 
> Then measure the time it takes to refilter the capture, as fast as possible,
> 10 times using the display filter "tcp.port==80".
> After the 10th refilter of the trace, measure the memory footprint of
> ethereal using ps.
> 
> Repeat this using a gzipped copy of the capture file.
> This could be the baseline.
> (The time to perform the test should be at least a couple of minutes so that
> we can see differences between test implementations more easily.)
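
Just to make the timing side concrete, something like this rough harness
is what I would wrap around the measurement; refilter_capture() is only a
stand-in for whatever we end up timing inside ethereal (it is not an
existing function), and the memory footprint would still be read off ps
as described above:

  /* Sketch only: time 10 refilter passes and report wall-clock seconds. */
  #include <stdio.h>
  #include <sys/time.h>

  /* stand-in for the rescan triggered by applying a display filter */
  static void refilter_capture(const char *dfilter)
  {
      (void)dfilter;   /* placeholder body */
  }

  int main(void)
  {
      struct timeval start, end;
      int i;

      gettimeofday(&start, NULL);
      for (i = 0; i < 10; i++)
          refilter_capture("tcp.port==80");
      gettimeofday(&end, NULL);

      printf("10 refilter passes took %.2f seconds\n",
             (end.tv_sec - start.tv_sec) +
             (end.tv_usec - start.tv_usec) / 1e6);
      return 0;
  }
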
> 
> 
> After this it would be good if someone did this test and collected gprof
> data to see where most of the CPU time is spent.
> I guess, but am not sure, that it is spent in the actual dissection of the
> packets. It would be great to have this verified.
> Everything below this line assumes the theory that the dissection of packets
> takes up a significant part of the time needed to refilter the capture.
> 
> 
> Hypothesis: a lot of time is spent redissecting the packets over and over
> every time we refilter the trace.
> Possible solution: reduce the number of times we redissect the packets.
> (We would need a preference option called "Faster refiltering at the expense
> of memory footprint". Memory is cheap but CPU power is finite.)
> 
> Currently:
> Every time we rescan a capture file we loop over every single packet and:
> dissect the packet completely and build an edt tree (in file.c),
> prune the edt tree to only contain those fields that are part of any of the
> filters we use (display filter, color filters, the filters used by taps, etc.),
> evaluate each filter one at a time using the now-pruned edt tree,
> discard the edt tree and go to the next packet.
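
In pseudo-C, with made-up type and function names (this is just the shape
of the loop you describe, not the actual file.c code):

  typedef struct frame frame_t;
  typedef struct edt   edt_t;        /* protocol tree for one packet */

  struct frame { frame_t *next; int passed_dfilter; };

  /* stand-ins for the real dissection and filtering machinery */
  edt_t *dissect_packet(frame_t *fr);              /* full dissection      */
  void   prune_to_interesting_fields(edt_t *edt);  /* drop unneeded fields */
  int    eval_filter(const char *dfilter, edt_t *edt);
  void   free_edt(edt_t *edt);

  void rescan(frame_t *first, const char *dfilter)
  {
      frame_t *fr;
      edt_t   *edt;

      for (fr = first; fr != NULL; fr = fr->next) {
          edt = dissect_packet(fr);          /* expensive, on every rescan */
          prune_to_interesting_fields(edt);
          fr->passed_dfilter = eval_filter(dfilter, edt);
          /* ...color filters and taps are evaluated the same way... */
          free_edt(edt);                     /* tree is thrown away        */
      }
  }
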
> 
> Possible optimization:
> When we dissect the packet the first time, keep the full edt tree around
> and never prune it.
> The loop would then be:
> Loop over all the packets:
>    For this packet, find the unpruned edt tree (don't dissect the packet at
> all).
>    Apply all filters one at a time on the unpruned edt tree.
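
Again in pseudo-C with the same made-up names, but with the unpruned tree
cached on each frame (built once at load time, never pruned, never freed
during a refilter):

  struct frame {
      frame_t *next;
      int      passed_dfilter;
      edt_t   *cached_edt;    /* full tree, kept for the life of the capture */
  };

  void rescan_cached(frame_t *first, const char *dfilter)
  {
      frame_t *fr;

      for (fr = first; fr != NULL; fr = fr->next) {
          if (fr->cached_edt == NULL)              /* only on the first pass */
              fr->cached_edt = dissect_packet(fr);
          fr->passed_dfilter = eval_filter(dfilter, fr->cached_edt);
          /* color filters and taps walk the same cached tree; nothing is
           * freed here, which is exactly where the extra memory goes */
      }
  }
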
> 
> This would, at the expense of keeping the potentially very memory-consuming
> edt tree around, eliminate the need to redissect the packet.
> The memory required by the edt tree could probably be optimized over time.
> The risk is: how will the time it takes to evaluate a filter be affected by
> the tree not being pruned? How does the evaluation time scale with the number
> of fields held in the edt tree? O(n) or O(n^2)? Maybe we would
> have to do major brain surgery there as well?
> Maybe it turns out to be infeasible? Who knows until we have tried.
> Still, SOME things we do would require a redissection of the packet anyway,
> such as clicking on a packet in the packet list, but that is not
> performance-critical.
> 
> 
> Can anyone see any obvious flaws in the reasoning above that would make it
> infeasible or the approach irrelevant?
> 
> 
> Would anyone who knows gprof be interested in doing the measurements on
> the "baseline" as suggested, and in keeping the same sample capture file
> for all further performance tests we do, so we know whether we are going in
> the right direction or not?
> 
> 
> _______________________________________________
> Ethereal-dev mailing list
> Ethereal-dev@xxxxxxxxxxxx
> http://www.ethereal.com/mailman/listinfo/ethereal-dev
> 

-- 
Regards
-----
Richard Sharpe, rsharpe[at]ns.aus.com, rsharpe[at]samba.org, 
sharpe[at]ethereal.com, http://www.richardsharpe.com