[Toybox] diff algorithms

Rob Landley rob at landley.net
Fri Aug 13 03:30:37 PDT 2021


On 8/12/21 4:27 PM, enh via Toybox wrote:
> you know how you (rob) have repeatedly expressed your desire to have a different
> diff implementation, and i've always either ignored you or claimed that the
> existing one is good enough?
> 
> well ... i finally hit a case where i can tell the difference. it turns out that
> if you have 3 million lines in the files you're diffing, GNU diff can get
> through that in less than 10s, busybox takes just under an hour (!), and toybox
> takes just over an hour.
> 
> i'm assuming you already knew of cases like this, but i'll keep my two 125MiB
> files somewhere just in case. they compress pretty well, being _very_ repetitive
> ASCII, but the zip file is still 11MiB so i won't post it without being asked.

I'd love to get a copy of those just for personal development testing if I can.
(Running problematic real world data through the thing is always preferable.)

I'll add a TODO for figuring out how to reasonably have the test suite address
the issue without checking in an 11 megabyte test file. :)
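
(One possible direction, sketched purely as an assumption and not anything
decided in this thread: have the test generate large, very repetitive ASCII
input deterministically instead of storing it. The helper names `make_lines`
and `make_pair`, the vocabulary size, and the change interval below are all
hypothetical, and there is no guarantee that synthetic data like this triggers
the same slowdown as the real 125MiB files.)

```python
import random


def make_lines(count, seed=0, vocab=16):
    """Yield `count` lines drawn from a tiny vocabulary (very repetitive ASCII)."""
    rng = random.Random(seed)  # fixed seed, so every run produces the same data
    words = [f"entry {i:04d} lorem ipsum" for i in range(vocab)]
    for _ in range(count):
        yield words[rng.randrange(vocab)]


def make_pair(count, seed=0, change_every=1000):
    """Return (old, new) line lists; `new` differs at every `change_every`-th line."""
    old = list(make_lines(count, seed))
    new = list(old)
    for i in range(0, count, change_every):
        # This text never matches a vocabulary line, so each edit is a real change.
        new[i] = f"changed line {i}"
    return old, new


def write_lines(path, lines):
    """Write the lines to `path`, one per line, as ASCII."""
    with open(path, "w", encoding="ascii") as f:
        for line in lines:
            f.write(line + "\n")
```

With `count` set to around 3 million this approximates the scale described
above. A test could write both files to a temporary directory, run diff on them
under a time limit, and delete them afterwards, so nothing large needs to be
checked in.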

Thanks,

Rob

