Wireshark-dev: Re: [Wireshark-dev] Note about proto_tree_add_unicode_string (r43379)
On 19/06/2012 21:14, Pascal Quantin wrote:
> On 19/06/2012 21:01, Jakub Zawadzki wrote:
>> Hi,
>>
>> A string from tvb_get_ephemeral_string() still needs escaping with format_text(),
>> because that function doesn't check the encoding.
>>
>> When you use:
>> tvb_get_ephemeral_string_enc(tvb, offset, length, ENC_UTF_8 | ENC_NA);
>>
>> it guarantees a result encoded in UTF-8:
>> * string as converted from the appropriate encoding to UTF-8 ...
>>
>> (The code to do it is still marked with XXX comments, but that is a bug in libwireshark, and no one can blame you for using the wrong function :))
> Hi,
>
> thanks for the hint (and for adding proto_tree_add_unicode_string :) ).
> Still, I am probably missing something, but when looking at the code for
> tvb_get_ephemeral_string_enc, I see:
>     case ENC_ASCII:
>     default:
>         /*
>          * For now, we treat bogus values as meaning
>          * "ASCII" rather than reporting an error,
>          * for the benefit of old dissectors written
>          * when the last argument to proto_tree_add_item()
>          * was a gboolean for the byte order, not an
>          * encoding value, and passed non-zero values
>          * other than TRUE to mean "little-endian".
>          *
>          * XXX - should map all octets with the 8th bit
>          * not set to a "substitute" UTF-8 character.
>          */
>         strbuf = tvb_get_ephemeral_string(tvb, offset, length);
>         break;
>
>     case ENC_UTF_8:
>         /*
>          * XXX - should map all invalid UTF-8 sequences
>          * to a "substitute" UTF-8 character.
>          */
>         strbuf = tvb_get_ephemeral_string(tvb, offset, length);
>         break;
>
> Do you mean we should already start using tvb_get_ephemeral_string_enc
> so that dissectors keep working once the check for the ASCII 8th bit is in place?
>
> Regards,
> Pascal.
Forget about it, I just saw your sentence in parentheses :)
Regards,
Pascal.
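
For reference, the substitution that the XXX comment in the ENC_ASCII case asks for could look roughly like the sketch below. It is only an illustration under stated assumptions: ascii_to_utf8_substitute() is a hypothetical stand-alone helper, not a libwireshark function; it uses plain malloc() rather than ephemeral memory; and it reads the comment as meaning octets with the 8th bit set (i.e. non-ASCII octets), each of which is replaced by U+FFFD REPLACEMENT CHARACTER (bytes EF BF BD in UTF-8).

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of libwireshark): copy `len` octets
 * from `src`, replacing every octet that has the 8th bit set with
 * U+FFFD, encoded in UTF-8 as EF BF BD.
 *
 * Returns a NUL-terminated string allocated with malloc() that the
 * caller must free(), or NULL if the allocation fails.
 */
char *ascii_to_utf8_substitute(const unsigned char *src, size_t len)
{
    /* Worst case: every octet expands to 3 bytes, plus the terminator. */
    char *out = malloc(len * 3 + 1);
    size_t i, o = 0;

    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        if (src[i] & 0x80) {
            /* Not 7-bit ASCII: emit the replacement character. */
            out[o++] = (char)0xEF;
            out[o++] = (char)0xBF;
            out[o++] = (char)0xBD;
        } else {
            /* Plain 7-bit ASCII is already valid UTF-8. */
            out[o++] = (char)src[i];
        }
    }
    out[o] = '\0';
    return out;
}
```

With something like this in place, the returned buffer would always be valid UTF-8, whatever octets the packet contains. Embedded NUL octets are copied unchanged here, so a caller that treats the result as a C string would still see it end at the first NUL.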