Wireshark-dev: [Wireshark-dev] g_strsplit : glib1.2 vs glib2.0
From: Sebastien Tandel <sebastien@xxxxxxxxx>
Date: Thu, 14 Dec 2006 17:45:24 +0100
Hi all,

Glib1.2 has a buggy strsplit implementation ... it is known and described in
"Porting applications to the GNOME 2.0 platform" on the website of GNOME
(see
http://developer.gnome.org/dotplan/porting/ar01s19.html#glib-misc-changes)

Here is a practical example to see how both implementations differ :
array = g_strsplit(str, "-", 2); which tries to split str (see for each
case, the value of it) with the "-" delimiter for a max_tokens of 2.

1) ""     : { (null) }
2) " "    : { " ", (null) }

3) "a"    : { "a", (null) }
4) "a-"   : { "a", "" , (null) }
{ "a", (null)  } with glib1.2


5) "-b"   : { "", "b" , (null) }
6) "a-b"          : { "a", "b" , (null) }


7) "-a-"  : { "", "a-" , (null) }
{ "", "a" , (null) } with glib1.2

8) "-a-b"  : { "", "a-b" , (null) }
{ "", "a" , "b", (null) } with glib1.2

9) "-a-b-"  : { "", "a-b-" , (null) }
{ "", "a" , "b-", (null) } with glib1.2

The main difference is that within glib1.2 max_tokens is not
semantically correct as it represents in fact the max number of *split*
tokens.



I propose the following indications which can help to highlight some
problems in the code :
(potential disagree refers to the cases here above)

1) Test (array[0] == null)
Motivations : in case of string == "" (string of length == 0)
potential disagree : 1
2) (max_tokens != 0 && test on length(array))
array length may vary from one implementation of glib to the other
potential disagree : 4, 7, 8, 9
3) (max_tokens == 0 && test on length(array))
array length may vary from one implementation of glib to the other
potential disagree : 4


Do you agree with these cases? Is it sufficient to cover them all?



I made a quick review of all the parts of wireshark using g_strsplit.
I am using 4 different status for each file:line (identifying a place
where g_strsplit is used) :
- safe : *should* not have any bugs and no segfault
- seems to be clean : I cannot decide if it's really safe. :)
- buggy : *may* impact the code which follows
- unsafe : *may* have segfault


** airpcap_loader.c:2397        buggy
line 2485 => test on the limit number ... whatif an user enter x:y:z:
??? which results in a ssid == ""
worst whatif input_string=""? tested before ?
** epan/dissectors/packet-ieee80211.c:5162        buggy
** epan/dissectors/packet-jxta.c:964:  seem to be clean
** epan/dissectors/packet-http.c:1425:   buggy
test NULL values ... => what-if ":" ??? gives a false information not more
** epan/dissectors/packet-k12.c:157:  safe
** epan/dissectors/packet-k12.c:166:   safe
** epan/stats_tree.c:531: unsafe (patch pending)
** epan/ex-opt.c:44: seems to be clean
** gtk/font_utils.c:239: seems to be clean
** gtk/voip_calls.c:2203:       safe
** gtk/voip_calls.c:2226:       buggy
whatif ci(02/16/08/29, "?
record name "" ... don't know if it's a problem?
** gtk/webbrowser.c:220:  seems to be clean
** plugins/mate/mate_grammar.lemon:131:  safe
** plugins/mate/mate_setup.c:197:       buggy
can return a "" instead of NULL?
** plugins/mate/mate_setup.c:584:       safe
** plugins/mate/mate_util.c:958:        safe
** plugins/mate/mate_grammar.c:155:  safe
** tap-funnel.c:142: safe since it is no more used.
** util.c:148: unsafe
** util.c:156: unsafe
** util.c:167: unsafe



Note : some in wireshark test (array == null). it can't be null => do we get
rid of tests which look at it in wireshark? (cleanup of useless tests.
It's something I'm not doing everyday :))

Regards,
Sebastien Tandel