Wireshark-dev: Re: [Wireshark-dev] slow sip voip flow for large captures
From: Cristian Constantin <const.crist@xxxxxxxxxxxxxx>
Date: Fri, 3 Feb 2012 15:19:24 +0100
On Fri, Feb 3, 2012 at 3:02 PM, Anders Broman <anders.broman@xxxxxxxxxxxx> wrote: > Hi, > Forget about my comment about the sip hash table. > > Please add the patch(es) to bugzilla, if possible break it up to something we can commit straight away and stuff that needs to be worked on to work > For all taps as that can't be committed until it's done for all tap. Two bug reports? cristian: but this means really 2 different patches which I have to test separately, right? otoh: * the append operation on lists with over 10k elements is really cpu intensive (imo it does NOT actually make sense to iterate over 10k pointers just to get to the end of some linked list) * if one does NOT apply that patch (step 3), one cannot actually see the actual improvement. I will open one bug report and post the results/patches there. > > Regards > Anders > -----Original Message----- > From: wireshark-dev-bounces@xxxxxxxxxxxxx [mailto:wireshark-dev-bounces@xxxxxxxxxxxxx] On Behalf Of Cristian Constantin > Sent: den 3 februari 2012 14:54 > To: Developer support list for Wireshark > Subject: Re: [Wireshark-dev] slow sip voip flow for large captures > > On Fri, Feb 3, 2012 at 1:44 PM, Anders Broman <anders.broman@xxxxxxxxxxxx> wrote: >> Hi, >> Please put the patches in bugzilla, if possible split them in two >> parts one that's ok for all types(Step 2,3,4?) of calls and one which Implements the stuff for SIP but not for the other call types. > > cristian: step 1 is dealing only with sip packets. > > step 2's improvement could be used all voip taps. > > step 3 is tricky since the lists have to be reversed (trivial operation for double linked lists; glib offers an api for this). however currently the prepend and reversing is performed only for sip calls. extra-work to be done here for other voip taps to use more efficient linked lists operations (prepend and eventl. reverse). also, this step modifies the callback for the button "Flow" of the "VoIP" window - on_graph_bt_clicked() - to reverse (once) the lists: > > * voip_calls_get_info()->callsinfo_list > * voip_calls_get_info()->graph_analysis->list > > so, in order to work correctly with the other taps, they have to use the "prepend" operation/convention as well. > > step 4 could be used for all taps > > let me see what I can do about the patches. > >> I think the SIP dissector keeps a table over the calls as well could that table be shared? > > cristian: I do not know which table is that. > >> Regards >> Anders >> >> -----Original Message----- >> From: wireshark-dev-bounces@xxxxxxxxxxxxx >> [mailto:wireshark-dev-bounces@xxxxxxxxxxxxx] On Behalf Of Cristian >> Constantin >> Sent: den 3 februari 2012 13:01 >> To: Developer support list for Wireshark >> Subject: [Wireshark-dev] slow sip voip flow for large captures >> >> hi! >> >> wireshark can draw call flows for sip voip calls (accessible through the menu Telephony/VoIP Calls). >> >> however, when the capture is large, containing tens of thousands of sip voip calls, wireshark becomes very slow at producing the list of calls and the call flows. >> >> here are my experiences with a capture file of 500 MB containing approx. 40K calls on a debian machine having: >> >> 4xAMD Phenom(tm) II X4 940 Processor >> 4GB RAM >> >> with the svn stock it takes about 10 hours to do it; the report of the profiling done with oprofile looks like this: >> >> CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted >> CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit >> mask of 0x00 (No unit mask) count 750000 samples % image name >> app name symbol name 8146520 55.2011 lt-wireshark >> lt-wireshark >> append_to_frame_graph.isra.7 >> 3698478 25.0610 libglib-2.0.so.0.2800.6 lt-wireshark >> g_list_last >> 1566933 10.6176 lt-wireshark lt-wireshark >> SIPcalls_packet >> 1162377 7.8763 libc-2.13.so lt-wireshark >> __strcmp_sse2 >> 8734 0.0592 libwireshark.so.0.0.1 lt-wireshark >> serv_name_lookup 8130 0.0551 libwireshark.so.0.0.1 >> lt-wireshark dissect_sip_common >> 6504 0.0441 libwireshark.so.0.0.1 lt-wireshark >> format_text >> 5749 0.0390 libwireshark.so.0.0.1 lt-wireshark >> tap_push_tapped_queue >> 4753 0.0322 libglib-2.0.so.0.2800.6 lt-wireshark >> g_unichar_isprint >> 4627 0.0314 libglib-2.0.so.0.2800.6 lt-wireshark >> g_hash_table_lookup >> 3699 0.0251 libwireshark.so.0.0.1 lt-wireshark >> guint8_pbrk >> >> step 1 >> ------ >> >> by changing SIPcalls_packet() to use a hash table for the lookup of the calls (on sip call-id) the runtime decreases to an (estimated) 3-4 hours. >> SIPcalls_packet()'s cpu usage has decreased from 10.61% to 0.0115% >> (1000 times faster) >> >> CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted >> CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit >> mask of 0x00 (No unit mask) count 750000 samples % image name >> symbol name >> 411305 69.4523 lt-wireshark >> append_to_frame_graph.isra.7 >> 171549 28.9675 libglib-2.0.so.0.2800.6 g_list_last >> 475 0.0802 libwireshark.so.0.0.1 serv_name_lookup >> 439 0.0741 libwireshark.so.0.0.1 dissect_sip_common >> 375 0.0633 libwireshark.so.0.0.1 format_text 360 >> 0.0608 libwireshark.so.0.0.1 tap_push_tapped_queue >> 303 0.0512 libglib-2.0.so.0.2800.6 g_hash_table_lookup >> 283 0.0478 libglib-2.0.so.0.2800.6 g_unichar_isprint >> 202 0.0341 libwireshark.so.0.0.1 guint8_pbrk >> 172 0.0290 libwireshark.so.0.0.1 proto_tree_add_item >> 149 0.0252 libwireshark.so.0.0.1 compute_offset_length >> 143 0.0241 libc-2.13.so __memcpy_sse2 >> 141 0.0238 libwireshark.so.0.0.1 call_dissector_work >> 141 0.0238 libwireshark.so.0.0.1 emem_free_all >> 115 0.0194 libc-2.13.so __strncmp_sse2 >> 112 0.0189 libc-2.13.so vfprintf >> 105 0.0177 libwireshark.so.0.0.1 proto_item_add_subtree >> 104 0.0176 libglib-2.0.so.0.2800.6 >> g_slice_free_chain_with_offset >> 99 0.0167 libwireshark.so.0.0.1 dissect_sdp >> 91 0.0154 libc-2.13.so free >> 91 0.0154 libwireshark.so.0.0.1 dissect_ip 90 0.0152 >> libwireshark.so.0.0.1 proto_field_is_referenced 90 0.0152 >> libwireshark.so.0.0.1 tvb_length_remaining 90 0.0152 >> lt-wireshark process_specified_packets.constprop.10 >> 89 0.0150 libglib-2.0.so.0.2800.6 g_slice_free1 >> 89 0.0150 libwiretap.so.0.0.1 file_seek >> 88 0.0149 libwireshark.so.0.0.1 dissect >> 86 0.0145 libwireshark.so.0.0.1 dissect_frame >> 82 0.0138 libwireshark.so.0.0.1 proto_is_protocol_enabled >> 79 0.0133 libc-2.13.so __strcmp_sse2 >> 79 0.0133 libc-2.13.so _int_malloc >> 78 0.0132 libglib-2.0.so.0.2800.6 g_slice_alloc >> 75 0.0127 libpthread-2.13.so pthread_getspecific >> 73 0.0123 libwireshark.so.0.0.1 tvb_find_guint8 >> 72 0.0122 libc-2.13.so malloc >> 71 0.0120 libwireshark.so.0.0.1 fast_ensure_contiguous >> 68 0.0115 lt-wireshark SIPcalls_packet >> >> step 2 >> ------ >> >> the next improvment was to change append_to_frame_graph() to use a hash table for graph analysis (still slow, over 1 hour): >> >> CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted >> CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit >> mask of 0x00 (No unit mask) count 750000 samples % image name >> symbol name >> 565268 94.6185 libglib-2.0.so.0.2800.6 g_list_last >> 1863 0.3118 libwireshark.so.0.0.1 serv_name_lookup 1620 >> 0.2712 libwireshark.so.0.0.1 dissect_sip_common 1340 0.2243 >> libwireshark.so.0.0.1 format_text >> 987 0.1652 libglib-2.0.so.0.2800.6 g_unichar_isprint >> 982 0.1644 libglib-2.0.so.0.2800.6 g_hash_table_lookup >> 819 0.1371 libwireshark.so.0.0.1 tap_push_tapped_queue >> 731 0.1224 libwireshark.so.0.0.1 guint8_pbrk >> 628 0.1051 libwireshark.so.0.0.1 proto_tree_add_item >> 587 0.0983 libglib-2.0.so.0.2800.6 >> g_hash_table_insert_internal >> 574 0.0961 libwireshark.so.0.0.1 compute_offset_length >> 405 0.0678 libwireshark.so.0.0.1 proto_is_protocol_enabled >> 401 0.0671 libwireshark.so.0.0.1 emem_free_all >> 395 0.0661 libwireshark.so.0.0.1 dissect_ip >> 391 0.0654 libc-2.13.so __strncmp_sse2 >> 367 0.0614 libwireshark.so.0.0.1 dissect ... >> 20 0.0017 lt-wireshark >> append_to_frame_graph.isra.8 >> >> however, append_to_frame_graph() got from 69.45% to 0.0017% (40852 >> times better) >> >> step 3 >> ------ >> >> I have changed the graph analysis list and the callsinfo list to use prepend instead of append: >> >> CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted >> CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit >> mask of 0x00 (No unit mask) count 750000 samples % image name >> symbol name >> 130285 18.6875 libc-2.13.so msort_with_tmp >> 66464 9.5333 libglib-2.0.so.0.2800.6 node_get_pos 53790 >> 7.7154 libglib-2.0.so.0.2800.6 g_hash_table_insert_internal >> 51146 7.3361 libc-2.13.so __memcpy_sse2 >> 44419 6.3712 libglib-2.0.so.0.2800.6 node_get_next >> 42461 6.0904 libgtk-x11-2.0.so.0.2400.4 gtk_rbtree_reorder_fixup >> 41265 5.9189 libglib-2.0.so.0.2800.6 is_end >> 40819 5.8549 libglib-2.0.so.0.2800.6 g_hash_table_resize >> 38905 5.5803 libgtk-x11-2.0.so.0.2400.4 _gtk_rbtree_next >> 36069 5.1736 libgtk-x11-2.0.so.0.2400.4 _gtk_rbtree_reorder >> 35828 5.1390 libglib-2.0.so.0.2800.6 g_hash_table_lookup >> 29469 4.2269 libgtk-x11-2.0.so.0.2400.4 save_positions >> 26858 3.8524 libgtk-x11-2.0.so.0.2400.4 generate_order >> 7892 1.1320 libc-2.13.so __memset_sse2 >> 7752 1.1119 libgtk-x11-2.0.so.0.2400.4 >> gtk_rbtree_reorder_sort_func 6860 0.9840 >> libgtk-x11-2.0.so.0.2400.4 gtk_rbtree_reorder_invert_func >> 6835 0.9804 libglib-2.0.so.0.2800.6 >> g_hash_table_remove_all_nodes >> 4085 0.5859 libglib-2.0.so.0.2800.6 g_array_append_vals >> 3864 0.5542 libglib-2.0.so.0.2800.6 g_sequence_iter_is_end >> 3591 0.5151 libc-2.13.so __memset_x86_64 >> 2057 0.2950 libglib-2.0.so.0.2800.6 g_array_maybe_expand >> >> step 4 >> ------ >> >> reordering is performed _continously_ while adding elements to the voip calls list. >> (by _gtk_rbtree_reorder which is calling qsort from libc which is >> calling msort_with_tmp) >> >> after disabling continous reordering, by using in voip_calls_dlg_update(): >> >> /* disable the re-ordering */ >> gtk_tree_sortable_set_sort_column_id(sortable, >> GTK_TREE_SORTABLE_UNSORTED_SORT_COLUMN_ID, GTK_SORT_ASCENDING); >> >> listx = g_list_first(listx); >> while (listx) { >> add_to_list_store((voip_calls_info_t*)(listx->data)); >> listx = g_list_next(listx); >> } >> >> /* enable the re-ordering */ >> gtk_tree_sortable_set_sort_column_id(sortable, >> CALL_COL_START_TIME, GTK_SORT_ASCENDING); >> >> CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted >> CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit >> mask of 0x00 (No unit mask) count 750000 samples % image name >> symbol name >> 13495 8.8003 libwireshark.so.0.0.1 format_text >> 10698 6.9763 libglib-2.0.so.0.2800.6 g_unichar_isprint >> 7421 4.8394 libwireshark.so.0.0.1 guint8_pbrk >> 4973 3.2430 libwireshark.so.0.0.1 compute_offset_length >> 4776 3.1145 libglib-2.0.so.0.2800.6 g_hash_table_lookup >> 4029 2.6274 libwireshark.so.0.0.1 dissect_sip_common >> 3851 2.5113 libglib-2.0.so.0.2800.6 g_atomic_pointer_get >> 3186 2.0776 libc-2.13.so __memcpy_sse2 >> 3064 1.9981 libglib-2.0.so.0.2800.6 >> g_hash_table_insert_internal >> 3008 1.9616 libwireshark.so.0.0.1 emem_free_all >> 2964 1.9329 libwireshark.so.0.0.1 fast_ensure_contiguous >> 2803 1.8279 libwireshark.so.0.0.1 serv_name_lookup >> 2793 1.8214 libwireshark.so.0.0.1 >> check_offset_length_no_exception >> 2114 1.3786 libc-2.13.so _int_malloc >> 2063 1.3453 libglib-2.0.so.0.2800.6 g_slice_free1 >> 2046 1.3342 libwireshark.so.0.0.1 tap_push_tapped_queue >> 1841 1.2005 libwireshark.so.0.0.1 tvb_get_guint8 >> 1713 1.1171 libc-2.13.so memchr >> 1668 1.0877 libwireshark.so.0.0.1 tvb_length_remaining >> 1625 1.0597 libglib-2.0.so.0.2800.6 g_atomic_int_get 1550 >> 1.0108 libwireshark.so.0.0.1 proto_tree_add_item >> 1524 0.9938 libc-2.13.so vfprintf >> 1475 0.9619 libwireshark.so.0.0.1 emem_alloc_chunk >> 1439 0.9384 libc-2.13.so __strncmp_sse2 >> 1395 0.9097 libglib-2.0.so.0.2800.6 g_slice_alloc >> 1312 0.8556 libwireshark.so.0.0.1 tvb_find_guint8 >> 1279 0.8341 libwireshark.so.0.0.1 >> ensure_contiguous_no_exception >> 1052 0.6860 libc-2.13.so malloc >> 1028 0.6704 libwireshark.so.0.0.1 dissect_sdp >> 1022 0.6665 libwireshark.so.0.0.1 tvb_pbrk_guint8 >> >> the runtime decreased now to approx. 1 minute (pretty reasonable when >> compared to 10 hours) >> >> I can publish the patches on bugzilla; however, the changes/tests have been explicitly targeted at sip voip calls. >> some of the data structures are global and used by _all_ voip protocol taps; it won't work correctly with dumps having also non-sip voip calls. >> >> pls. let me know your opinion. >> >> thanks. >> cristian >> ______________________________________________________________________ >> _____ Sent via: Wireshark-dev mailing list >> <wireshark-dev@xxxxxxxxxxxxx> >> Archives: http://www.wireshark.org/lists/wireshark-dev >> Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev >> >> mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe >> ______________________________________________________________________ >> _____ Sent via: Wireshark-dev mailing list >> <wireshark-dev@xxxxxxxxxxxxx> >> Archives: http://www.wireshark.org/lists/wireshark-dev >> Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev >> >> mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe > ___________________________________________________________________________ > Sent via: Wireshark-dev mailing list <wireshark-dev@xxxxxxxxxxxxx> > Archives: http://www.wireshark.org/lists/wireshark-dev > Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev > mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe > ___________________________________________________________________________ > Sent via: Wireshark-dev mailing list <wireshark-dev@xxxxxxxxxxxxx> > Archives: http://www.wireshark.org/lists/wireshark-dev > Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev > mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe
- Follow-Ups:
- Re: [Wireshark-dev] slow sip voip flow for large captures
- From: Anders Broman
- Re: [Wireshark-dev] slow sip voip flow for large captures
- References:
- [Wireshark-dev] slow sip voip flow for large captures
- From: Cristian Constantin
- Re: [Wireshark-dev] slow sip voip flow for large captures
- From: Anders Broman
- Re: [Wireshark-dev] slow sip voip flow for large captures
- From: Cristian Constantin
- Re: [Wireshark-dev] slow sip voip flow for large captures
- From: Anders Broman
- [Wireshark-dev] slow sip voip flow for large captures
- Prev by Date: Re: [Wireshark-dev] slow sip voip flow for large captures
- Next by Date: [Wireshark-dev] Meeting minutes from (pre)FOSDEM meeting
- Previous by thread: Re: [Wireshark-dev] slow sip voip flow for large captures
- Next by thread: Re: [Wireshark-dev] slow sip voip flow for large captures
- Index(es):