Wireshark-dev: Re: [Wireshark-dev] Python bindings for wireshark
From: Lukas Lueg <lukas.lueg@xxxxxxxxx>
Date: Sun, 26 Jan 2014 20:10:34 +0100
I'll look into Pyreshark; writing custom dissectors using Wirepy is probably a much-wanted use case.
It is important to note that Wirepy is not a plugin that runs inside Wireshark but a wrapper for all of Wireshark's functionality. Wireshark doesn't run Wirepy; Wirepy runs libwireshark. This seems to be the only direction worth following to get all the desired features for Python. Using CFFI as a first layer of abstraction, we gain compatibility with PyPy (which Pyreshark and Wireshark's own original interface lack) and save around 12,000 lines of C code (which the interface auto-generates).
The downside of a direct interface to libwireshark/libwiretap/etc. is that the abstraction layer feels C-ish, because libwireshark was not designed as a library for code other than the Wireshark GUI, and it is probably hard to maintain. As of now, one also can't use Wirepy to write dissectors that run in Wireshark or tshark. It should be fairly easy, however, to provide a loader plugin that loads Wirepy, which in turn provides its own dissectors and other stuff.
There is an incomplete to-do list for these things, together with the preliminary docs, at http://wirepy.readthedocs.org/
I originally re-implemented what other Wireshark-Python interfaces do to get hold of the provided information (e.g. running tshark and parsing the XML coming out of it). The performance, however, is - as one might expect - abysmal. Other - pure-Python - libraries like Scapy drown in gazillions of calls to struct.unpack(), just to find out that the packet is of no interest.
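To make the struct.unpack() cost concrete, here is a minimal sketch of what a pure-Python library has to do for every single frame before it can even decide to drop it. This is not Scapy's code; the names is_tcp() and build_frame() are made up for illustration, and only the fixed 14-byte Ethernet and 20-byte IPv4 headers are considered.

```python
import struct

# Precompiled header layouts (network byte order, no padding).
ETH_HEADER = struct.Struct("!6s6sH")          # dst MAC, src MAC, EtherType
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")  # fixed 20-byte IPv4 header

ETHERTYPE_IPV4 = 0x0800
PROTO_TCP = 6


def is_tcp(frame: bytes) -> bool:
    """Return True if the Ethernet frame carries an IPv4/TCP packet.

    Hypothetical helper: every frame costs two unpack calls, even when
    the answer is simply "not interesting".
    """
    if len(frame) < ETH_HEADER.size + IPV4_HEADER.size:
        return False
    _dst, _src, ethertype = ETH_HEADER.unpack_from(frame, 0)
    if ethertype != ETHERTYPE_IPV4:
        return False
    fields = IPV4_HEADER.unpack_from(frame, ETH_HEADER.size)
    version = fields[0] >> 4   # high nibble of the first byte
    protocol = fields[6]       # IP protocol number
    return version == 4 and protocol == PROTO_TCP


def build_frame(protocol: int) -> bytes:
    """Build a synthetic Ethernet+IPv4 header pair for the demo (no payload)."""
    eth = ETH_HEADER.pack(b"\xff" * 6, b"\x00" * 6, ETHERTYPE_IPV4)
    ip = IPV4_HEADER.pack(0x45, 0, 20, 0, 0, 64, protocol, 0,
                          bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    return eth + ip
```

Calling is_tcp(build_frame(17)) on a UDP frame still pays for both unpack calls before returning False, which is the overhead that adds up at line rate.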
The "cube" example (https://github.com/lukaslueg/wirepy/blob/master/examples/create_cube_events.py) demonstrates what I need Wirepy for: dumping information into a database for real-time monitoring of network events. It uses multiple CPU cores (through the multiprocessing module) and can handle several hundred Mbit/s of network traffic without falling behind.
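The general fan-out pattern behind this can be sketched with the multiprocessing module alone. This is not the code of the cube example and does not use Wirepy; count_sizes(), split() and summarize() are hypothetical names, and the "work" per frame is just bucketing frame lengths in place of real dissection.

```python
import multiprocessing
from collections import Counter


def count_sizes(chunk):
    """Worker: count frames per 512-byte length bucket (stand-in for dissection)."""
    counts = Counter()
    for frame in chunk:
        counts[len(frame) // 512 * 512] += 1
    return counts


def split(items, n):
    """Split items into n strided chunks of roughly equal size."""
    return [items[i::n] for i in range(n)]


def summarize(frames, workers=2):
    """Fan the frames out to worker processes and merge the partial counts."""
    with multiprocessing.Pool(workers) as pool:
        partials = pool.map(count_sizes, split(frames, workers))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

summarize() should be called from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. In a real setup the merged result would be written to the database instead of being returned.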
Display filters also turn out to be a great tool for quickly getting information about a packet's content. One may compile as many display filters as needed and inspect a packet's content without ever having to manually dive down into the protocol tree. Because display filters are executed by libwireshark's own VM, they a) perform *far* better than anything you can get out of Python code and b) save a ton of Python code, which would probably handle this ASN.1 string over there incorrectly anyway.
2014-01-26 Evan Huus <eapache@xxxxxxxxx>
Sounds neat! You should probably be aware of Pyreshark [1] if you
aren't already. It provides an interface for writing dissectors in
Python and hooking them into the main engine, so I believe it's
complementary to your work. It may be worth collaborating with the
author, or even merging the two projects to provide a single unified
Python API.
Evan
[1] https://code.google.com/p/pyreshark/
P.S. As a general comment to the list, we really ought to remove the
old Python bindings from trunk since they are terribly out of date and
buggy at this point. Last time this came up it turned out some
packager (Red Hat?) was still using them so we left them in, but I
think they're probably doing more harm than good at this point...
On Sun, Jan 26, 2014 at 12:42 PM, Lukas Lueg <lukas.lueg@xxxxxxxxx> wrote:
> Hi,
>
> given the dark abyss that packet dissection libraries available to Python
> are, I've just started a library to make the code beneath Wireshark's GUI
> available to Python. Wirepy is a foreign function interface to use Wireshark
> within Python as implemented by CPython and PyPy.
>
> Working with dumpcap, wiretap, dissection of packets to protocol-trees and
> columns is usable but most of the more fine-grained functionality is not yet
> implemented. Also, a more pythonic API needs to be created atop the FFI.
>
> While valgrind shows that about 35% of CPU time is spent in the Python
> interpreter, a single one of my laptop's cores can handle about 100 Mbit of
> traffic per second - not bad.
>
> The code just matured to its own git repo and now lives at
> https://github.com/lukaslueg/wirepy
>
> I'd be grateful for comments, passing the word, and contributions.
>
> Best regards
> Lukas
>
> Sent via: Wireshark-dev mailing list <wireshark-dev@xxxxxxxxxxxxx>
> Archives: http://www.wireshark.org/lists/wireshark-dev
> Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
> mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe