I create a GString str = “A{Dagger}B{Sigma}C”; that is, the bytes “\x41\xE2\x80\xA0\x42\xCE\xA3\x43”, where \xE2\x80\xA0 is Dagger (U+2020) and \xCE\xA3 is Sigma (U+03A3).
The Dagger bytes are the correct UTF-8 encoding (https://www.fileformat.info/info/unicode/char/2020/index.htm)
and the Sigma bytes are the correct UTF-8 encoding (https://www.fileformat.info/info/unicode/char/03a3/index.htm).
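To double-check the byte sequences independently of Wireshark, here is a minimal sketch of a UTF-8 decoder in plain C. The helper names utf8_decode_one and utf8_decode_all are made up for this illustration and are not part of Wireshark or GLib. The decoder is deliberately simplified: it checks lead and continuation bytes but does not reject overlong forms or surrogates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Wireshark or GLib API).
 * Decodes one UTF-8 sequence starting at s (at most len bytes available).
 * Stores the code point in *cp and returns the number of bytes consumed,
 * or 0 if the sequence is truncated or malformed.
 * Simplified: overlong forms and surrogates are not rejected. */
static size_t utf8_decode_one(const unsigned char *s, size_t len, uint32_t *cp)
{
    size_t n, i;

    if (len == 0)
        return 0;

    /* The lead byte determines the sequence length and its payload bits. */
    if (s[0] < 0x80)                { *cp = s[0];        n = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; n = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; n = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; n = 4; }
    else
        return 0; /* stray continuation byte or invalid lead byte */

    if (n > len)
        return 0; /* truncated sequence */

    /* Each continuation byte must look like 10xxxxxx and adds 6 bits. */
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        *cp = (*cp << 6) | (uint32_t)(s[i] & 0x3F);
    }
    return n;
}

/* Hypothetical helper: decodes a whole buffer into out (capacity max_out).
 * Returns the number of code points written, or -1 on malformed input
 * or if out is too small. */
static int utf8_decode_all(const unsigned char *s, size_t len,
                           uint32_t *out, size_t max_out)
{
    size_t pos = 0, count = 0;

    while (pos < len) {
        uint32_t cp;
        size_t n = utf8_decode_one(s + pos, len - pos, &cp);

        if (n == 0 || count == max_out)
            return -1;
        out[count++] = cp;
        pos += n;
    }
    return (int)count;
}
```

Passing the eight bytes 41 E2 80 A0 42 CE A3 43 to utf8_decode_all gives five code points: U+0041, U+2020, U+0042, U+03A3, U+0043. So the bytes themselves look well-formed, which suggests the difference is in how the two display paths treat the string rather than in the data.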
I use col_append_lstr(pinfo->cinfo, COL_INFO, str, COL_ADD_LSTR_TERMINATOR);
The display is “A{Dagger}B{Sigma}C” where the {Dagger} and {Sigma} are the correct visual single characters.
I use proto_string_add_string(…, str);
The display is “A{Dagger}B{black-diamond-with-question-mark}{black-diamond-with-question-mark}C”, where {black-diamond-with-question-mark} is a single glyph of a black diamond containing a question mark (presumably U+FFFD, the replacement character). It appears twice, once for each byte of the Sigma.
So col_append_lstr handles UTF-8, while proto_string_add_string handles it only partially: the Dagger is displayed correctly but the Sigma is not.
How can I get a proto_string_* function that will display UTF-8 correctly like col_append_lstr does?
I do not need any string function to validate my UTF-8 bytes (if I make a mistake, that’s my problem). I just want a consistent display.
Environment:
Windows 10 Enterprise (10.0.18363) x64
Microsoft Visual Studio Community 2019 Version 16.7.1
Qt v5.15.0 using msvc2019_64
Wireshark 3.3.0 with custom dissector
Wireshark Font Consolas Regular 12.0