Wireshark-dev: Re: [Wireshark-dev] [Wireshark-commits] rev 20491: /trunk/docbook/ /trunk/docboo
From: Sebastien Tandel <sebastien@xxxxxxxxx>
Date: Fri, 19 Jan 2007 11:14:22 +0100
Just for fun, and to get an idea of how long these commands take ...
I ran each test three times (only one representative run is shown), each
followed by a sync, and the computer has enough free memory to hold the
whole file. Here are the numbers for a 179 MB file:


time tr -d '\015' <file.dos >file.unix
real    0m1.323s
user    0m0.390s
sys     0m0.857s

time sed -e 's/^M$//' file.dos >file.unix
real    0m4.518s
user    0m3.458s
sys     0m0.856s

time perl -p -e 's/\015\012/\012/;' <file.dos >file.unix
real    0m5.100s
user    0m3.803s
sys     0m0.905s


A regexp is definitely not the fastest way to do dos2unix: sed and perl
take roughly 3.5 to 4 times as long as tr in wall-clock time ... :)
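
For anyone who wants to repeat the comparison, here is a minimal sketch that
checks the tr and sed variants produce the same output on a small generated
sample. The function names are hypothetical, mktemp is assumed to be
available, and the CR is produced with printf instead of being typed as a raw
control character.

```shell
#!/bin/sh
# Hypothetical helper names; each reads CR/LF text on stdin
# and writes LF text on stdout.

# Byte-level deletion: removes every CR, including a stray CR in mid-line.
dos2unix_tr() {
    tr -d '\015'
}

# Regexp version: removes a CR only at the end of a line.
# The CR comes from printf, so the script contains no raw control character.
dos2unix_sed() {
    cr=$(printf '\r')
    sed -e "s/${cr}\$//"
}

# Build a small CR/LF sample and confirm that both helpers agree on it.
sample=$(mktemp)
printf 'first\r\n\r\nlast\r\n' >"$sample"
dos2unix_tr  <"$sample" >"$sample.tr"
dos2unix_sed <"$sample" >"$sample.sed"
cmp "$sample.tr" "$sample.sed" && echo "outputs match"
rm -f "$sample" "$sample.tr" "$sample.sed"
```

Note that the two are only equivalent when every CR sits at the end of a line:
tr deletes all CR bytes, while the sed expression touches only a trailing one.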


Regards,

Sebastien Tandel


Sake Blok wrote:
> On Thu, Jan 18, 2007 at 04:20:40PM -0800, Guy Harris wrote:
>   
>> On Jan 18, 2007, at 4:08 PM, Sebastien Tandel wrote:
>>
>>     
>>>>  Is it safe to assume that dos2unix is available on a common UNIX  
>>>> developer machine?!?
>>>>         
>>> Nope, it is not ... :-/
>>>       
>> No, but
>>
>> 	tr -d '\015' <file_with_CR_LF_line_endings >file_with_LF_line_endings
>>
>> will probably be available.  (Unfortunately, "tr" isn't able to do  
>> unix2dos.)
>>     
>
> Since 'sed' is a requirement for a development-environment, it can also
> be used for EOL transformations:
>
> dos2unix: sed -e 's/^M$//'  <-- Use ctrl-v ctrl-m to create the ^M
> unix2dos: sed -e 's/$/^M/'  <-- Use ctrl-v ctrl-m to create the ^M
>
> Unfortunately "\r" is not recognized by sed on all platforms and 
> neither is \015. Therefore we have to use the raw CR character.
>
> Cheers,
>
>
> Sake
>
> PS  To avoid adding a second CR to lines that already end in one, you
>     can match any existing CRs as well: sed -e 's/^M*$/^M/'
>
> _______________________________________________
> Wireshark-dev mailing list
> Wireshark-dev@xxxxxxxxxxxxx
> http://www.wireshark.org/mailman/listinfo/wireshark-dev
>
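
As a rough sketch of the unix2dos direction discussed in the quoted message,
the following shows two ways to add CRs without doubling them on lines that
already end in CR/LF. The function names are hypothetical. The sed variant
gets its CR from printf; the awk variant assumes an awk that understands \r in
strings and regular expressions, which I have not checked on every platform.

```shell
#!/bin/sh
# Hypothetical helper names; each reads text on stdin
# and writes CR/LF text on stdout.

# sed version: replace any run of CRs at end of line with exactly one CR,
# so lines that already end in CR/LF come out unchanged.
unix2dos_sed() {
    cr=$(printf '\r')
    sed -e "s/${cr}*\$/${cr}/"
}

# awk version: strip one trailing CR if present, then print the line
# followed by CR/LF.
unix2dos_awk() {
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}

# Mixed input: an LF-only line, a CR/LF line and an empty line.
# od -c makes the resulting line endings visible.
printf 'one\ntwo\r\n\n' | unix2dos_sed | od -c
```

Both helpers can be run over a file that is already in DOS format without
changing it, so applying them twice by accident should do no harm.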