Wireshark-users: Re: [Wireshark-users] Get all the duplicate packets
From: Jim Young <jyoung@xxxxxxx>
Date: Thu, 20 Sep 2012 03:38:02 +0000
Hello Boaz,

On 9/19/12 4:19 PM, "Boaz Galil" <boaz20@xxxxxxxxx> wrote:
> Editcap -d will remove all the duplicates! I actually want
> to find all the duplicate packets....

<snip>

Assuming that you are looking for frame-level duplicates, there are
a couple of ways of determining which frames may be duplicates. Both
involve displaying the MD5 hash of each frame.

NOTE: The MD5 hash technique described below will NOT work for
L3-level duplicates such as one might see on a one-armed router
interface, where each packet might be seen twice, ingressing on one
VLAN tag and egressing on another.

NOTE2: Using the MD5 hash technique, it's possible (though very
unlikely) to get a false positive, where two unrelated packets
generate the exact same MD5 hash value.

Assuming that the Wireshark preference frame.generate_md5_hash is
TRUE (see the note towards the bottom of this message on how to check
and set it), the following tshark command line can be used to
generate a potentially large display filter for duplicate packets.

    echo "+++The filter ..."
    echo $(tshark -r MYFILE.PCAP -Tfields -e frame.md5_hash \
      | sort \
      | uniq -c \
      | sort -n -r \
      | grep -v ' 1 ' \
      | awk 'BEGIN {printf "frame.number==0"} \
        {printf "||frame.md5_hash=="$2} END {print ""}')

The constructed display filter starts with the do-nothing clause
"frame.number==0" so that the first "||" has something to its left.

The command line above can be augmented to actually display the
duplicate frames by invoking tshark twice.

    echo "+++The command to display the duplicates..."
    tshark -r MYFILE.PCAP \
      $(tshark -r MYFILE.PCAP -Tfields -e frame.md5_hash \
      | sort \
      | uniq -c \
      | sort -n -r \
      | grep -v ' 1 ' \
      | awk 'BEGIN {printf "frame.number==0"} \
        {printf "||frame.md5_hash=="$2} END {print ""}')

Or the command line can be modified to save the duplicates to a new
pcap file (MYDUPLICATES.PCAP). Again, this involves invoking tshark
twice.

    echo "+++The command to save the duplicates..."
    tshark -w MYDUPLICATES.PCAP -r MYFILE.PCAP \
      $(tshark -r MYFILE.PCAP -Tfields -e frame.md5_hash \
      | sort \
      | uniq -c \
      | sort -n -r \
      | grep -v ' 1 ' \
      | awk 'BEGIN {printf "frame.number==0"} \
        {printf "||frame.md5_hash=="$2} END {print ""}')

FWIW: In addition to using tshark, you can also use editcap to
display the MD5 hash of each frame.

Here's an MD5 hash example using editcap:

> $ editcap -v -D 0 MYFILE.PCAP /dev/null
> File MYFILE.pcap is a Wireshark - pcapng capture file.
> Packet: 1, Len: 42, MD5 Hash: dc69bb2da069731e40367bed2cb44d56
> Packet: 2, Len: 42, MD5 Hash: dc69bb2da069731e40367bed2cb44d56
> Packet: 3, Len: 42, MD5 Hash: dc69bb2da069731e40367bed2cb44d56
> <snip>

And here's the MD5 example using tshark:

> $ tshark -o frame.generate_md5_hash:TRUE -r MYFILE.PCAP -Tfields -e frame.md5_hash
> dc69bb2da069731e40367bed2cb44d56
> dc69bb2da069731e40367bed2cb44d56
> dc69bb2da069731e40367bed2cb44d56
> <snip>

While the tshark report is cleaner (1 column versus 7), both the
editcap and tshark output can be post-processed to extract the same
counts of any duplicate MD5 hashes. I believe using editcap to
generate the MD5 hashes is faster than using tshark when processing
large trace files.

The examples below illustrate how both editcap and tshark can be used
to generate virtually identical lists of any duplicate MD5 hashes.

#1 - Using editcap:

> $ editcap -v -D 0 MYFILE.PCAP /dev/null \
>   | grep Hash: \
>   | awk '{ print $7 }' \
>   | sort \
>   | uniq -c \
>   | grep -v ' 1 '
> File MYFILE.pcap is a Wireshark - pcapng capture file.
>  18 198e273fe9792cbf54919701db49b9cf
>  12 1e848f674c60a07d23f7104b8a205a1c
>   4 28c92df42bbf9c94a93560a5fb3decf0
>   2 3aabbf2969b96da88ee9b5937345eb75
>   6 636c43db7e87aa86c0afaf479ded30cf
>   4 67a1a4f23bf565d2ab946955a0dc4b70
>   3 6e30d01d335343eed4dca273d95d6347
>  24 8d7780d026fb1d883717a6957abf2476
>  12 92063b2f67c0246413959046bf455c26
>   3 dc69bb2da069731e40367bed2cb44d56
>   2 e7177c946c4638b72fc62fe05bc5e30a
>   9 fdaf0bcb2fe45420232fdd990c4fa655
> $

#2 - Using tshark:

> $ tshark -r MYFILE.PCAP -Tfields -e frame.md5_hash \
>   | sort \
>   | uniq -c \
>   | sort -n -r \
>   | grep -v ' 1 '
>  18 198e273fe9792cbf54919701db49b9cf
>  12 1e848f674c60a07d23f7104b8a205a1c
>   4 28c92df42bbf9c94a93560a5fb3decf0
>   2 3aabbf2969b96da88ee9b5937345eb75
>   6 636c43db7e87aa86c0afaf479ded30cf
>   4 67a1a4f23bf565d2ab946955a0dc4b70
>   3 6e30d01d335343eed4dca273d95d6347
>  24 8d7780d026fb1d883717a6957abf2476
>  12 92063b2f67c0246413959046bf455c26
>   3 dc69bb2da069731e40367bed2cb44d56
>   2 e7177c946c4638b72fc62fe05bc5e30a
>   9 fdaf0bcb2fe45420232fdd990c4fa655
> $

NOTE: For the tshark MD5 hash pipelines to work, the Wireshark
preference "frame.generate_md5_hash" must be enabled.

You can easily determine whether the frame.generate_md5_hash
preference is enabled using the following tshark pipeline:

> $ tshark -G currentprefs | grep frame.generate_md5_hash
> frame.generate_md5_hash: TRUE
> $

If MD5 hashes are disabled (which I believe is the default), they can
be enabled manually on the tshark command line using tshark's -o
option:

    -o frame.generate_md5_hash:TRUE

That would make the tshark command line that saves the packets to a
new file look like:

    tshark -o frame.generate_md5_hash:TRUE \
      -w MYDUPLICATES.PCAP -r MYFILE.PCAP \
      $(tshark -o frame.generate_md5_hash:TRUE \
      -r MYFILE.PCAP -Tfields -e frame.md5_hash \
    <snip>

But it's probably easier to just permanently enable MD5 hashes in
Wireshark's preference file so that you don't have to remember to use
the tshark -o frame.generate_md5_hash:TRUE option.
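
The filter-building step in the pipelines above can also be wrapped
in a small shell function. This is only a sketch: the name
build_dup_filter is hypothetical (it is not part of Wireshark), and
it swaps the "uniq -c | grep -v ' 1 '" step for "uniq -d", which
prints one copy of each repeated line, so the result does not depend
on how uniq pads its count column. It expects one MD5 hash per line
on stdin.

```shell
#!/bin/sh
# build_dup_filter: hypothetical helper, not part of Wireshark.
# Reads one MD5 hash per line on stdin and prints a display filter
# that matches every hash occurring more than once.
build_dup_filter() {
    sort | uniq -d | awk '
        # Do-nothing first clause so every "||" has a left operand.
        BEGIN { printf "frame.number==0" }
        # Skip blank lines (e.g. if the hash field was not generated).
        NF    { printf "||frame.md5_hash==%s", $1 }
        END   { print "" }'
}

# Intended use (assumes frame.generate_md5_hash is enabled):
#   tshark -r MYFILE.PCAP -Tfields -e frame.md5_hash | build_dup_filter
```

If no hash repeats, the function prints just "frame.number==0", which
matches no frames.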
Hope this helps, Jim Y.