Andrew Yourtchenko said:
> I've recently had the need to filter out some data from a 3.7G
> sniffer-format capture, and before I figured out that gzipping that
> file could have done the trick, as it would have been smaller than 2G
> (although I'm still not sure of that),
Unfortunately, zlib is part of the problem here, not part of the
solution. Seek offsets in zlib are z_off_t's, and those are often longs even
on platforms that support 64-bit file offsets - and those offsets are
offsets in the *uncompressed* data stream, not offsets in the compressed
file.
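To make that concrete, here's a throwaway test program (illustrative
only, not proposed code) showing both halves of the problem:

    #include <stdio.h>
    #include <zlib.h>

    int main(void)
    {
        /* z_off_t defaults to a plain long in zconf.h unless zlib
         * was built with large-file support, so on ILP32 platforms
         * it's 32 bits and can't represent offsets past 2G-1. */
        printf("sizeof(z_off_t) = %u bits\n",
               (unsigned)(sizeof(z_off_t) * 8));

        /* Worse, gzseek() and gztell() measure positions in the
         * *uncompressed* stream - so even a capture that compresses
         * to well under 2G blows past the limit as soon as more
         * than 2G of it has been decompressed. */
        return 0;
    }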
> Having done this, I figured I might describe what I did and check
> with you whether, with some additions, it might become a useful
> contribution toward ">2G file" support in Ethereal (well, probably
> not the GUI-based Ethereal, but the CLI utilities might benefit)
>
> Also, change the call to fseek() to fseeko()
Not all platforms necessarily have "fseeko()". The configure script would
need to check for that and fall back on "fseek()" if it's not present.
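Something along these lines would probably do - a minimal sketch,
assuming configure defines HAVE_FSEEKO when the function is available
(the "file_seek" name is made up for illustration):

    #include <stdio.h>

    #ifdef HAVE_FSEEKO
    /* fseeko() takes an off_t, which is 64 bits when
     * _FILE_OFFSET_BITS=64 is in effect. */
    #define file_seek(fh, offset, whence) fseeko((fh), (offset), (whence))
    #else
    /* No fseeko(); fall back on fseek(), which limits us to
     * offsets that fit in a long (2G-1 on ILP32 platforms). */
    #define file_seek(fh, offset, whence) fseek((fh), (long)(offset), (whence))
    #endif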
> c) configure with:
>
> CFLAGS="-g -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE" ./configure
> --disable-ethereal --without-plugins --without-zlib
Are those the flags that would suffice on *all* platforms that have "fseeko()"?
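For what it's worth, autoconf's AC_SYS_LARGEFILE macro is meant to work
out the right defines per platform, which might be the more portable
route. One way to check whether a given set of flags actually took
effect on a particular platform is a throwaway test program along these
lines (the file name is hypothetical):

    /* Build once with and once without the flags and compare:
     *
     *   cc check_off_t.c
     *   cc -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE check_off_t.c
     */
    #include <stdio.h>
    #include <sys/types.h>

    int main(void)
    {
        printf("sizeof(off_t) = %u bits\n",
               (unsigned)(sizeof(off_t) * 8));
        return 0;
    }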
> Interop with zlib would obviously need some more work - I did not deal
> with it since I did not have the explicit need - but is this the only
> obstacle, or is there something more fundamental that I have missed?
It's probably the hardest obstacle to overcome.
Doing our own decompression, rather than using the gz routines, would let
us avoid that problem - and could also let us keep around, in memory,
enough compression dictionary information to support efficient random
access to a gzipped file (currently, seeking forward in a file is done by
skipping forward, and seeking backward is done by going back to the
beginning and skipping forward).
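That's essentially the trick used by zlib's zran.c example: make one
sequential pass, recording an "access point" every so often, then
restart inflation from the nearest one on a seek. A rough sketch of the
data involved (names made up; untested):

    #include <sys/types.h>
    #include <zlib.h>

    #define WINSIZE 32768U  /* deflate's maximum back-reference distance */

    /* One checkpoint, captured during the initial sequential pass. */
    struct access_point {
        off_t uncomp_off;  /* position in the uncompressed stream */
        off_t comp_off;    /* position in the compressed file */
        int   bits;        /* bit offset into the byte at comp_off */
        unsigned char window[WINSIZE];  /* last 32K of output = dictionary */
    };

    /* To seek to an arbitrary uncompressed offset: find the last access
     * point at or before it, seek to point->comp_off in the compressed
     * file, then re-prime a raw inflater from the saved state and
     * inflate forward:
     *
     *     inflateInit2(&strm, -15);               raw deflate data
     *     inflatePrime(&strm, point->bits, ...);  leftover bits, if any
     *     inflateSetDictionary(&strm, point->window, WINSIZE);
     *
     * Backward seeks then cost at most one checkpoint interval of
     * decompression rather than a re-read from the start of the file. */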