[Toybox] diff algorithms

enh enh at google.com
Thu Aug 12 14:27:38 PDT 2021


you know how you (rob) have repeatedly expressed your desire to have a
different diff implementation, and i've always either ignored you or
claimed that the existing one is good enough?

well ... i finally hit a case where i can tell the difference. it turns out
that if you have 3 million lines in the files you're diffing, GNU diff can
get through that in less than 10s, busybox takes just under an hour (!),
and toybox takes just over an hour.

i'm assuming you already knew of cases like this, but i'll keep my two
125MiB files somewhere just in case. they compress pretty well, being
_very_ repetitive ASCII, but the zip file is still 11MiB so i won't post it
without being asked.
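
for anyone who wants to try a similar timing comparison without the original files, here is a minimal sketch that builds two large, very repetitive ASCII files differing in only a few lines. everything in it is an assumption for illustration: the function names, the line format, the vocabulary size, and the number of changed lines are hypothetical and are not taken from the real 125MiB inputs, so the timings it produces may look nothing like the ones above.

```python
import random


def make_repetitive_lines(n_lines, vocab=16, seed=0):
    """Return n_lines ASCII lines drawn from a small pool (stand-in data)."""
    rng = random.Random(seed)
    # A small pool makes the output highly repetitive, so it compresses well.
    pool = [f"record type={i} status=ok\n" for i in range(vocab)]
    return [rng.choice(pool) for _ in range(n_lines)]


def mutate(lines, n_changes, seed=1):
    """Copy lines and overwrite n_changes distinct positions."""
    rng = random.Random(seed)
    out = list(lines)
    # sample() picks distinct indices, so exactly n_changes lines differ.
    for idx in rng.sample(range(len(out)), n_changes):
        # This text never appears in the pool, so the line really changes.
        out[idx] = f"record type=changed at={idx}\n"
    return out


def write_pair(path_a, path_b, n_lines=3_000_000, n_changes=100):
    """Write the original and mutated files to the two given paths."""
    a = make_repetitive_lines(n_lines)
    b = mutate(a, n_changes)
    with open(path_a, "w", encoding="ascii") as f:
        f.writelines(a)
    with open(path_b, "w", encoding="ascii") as f:
        f.writelines(b)
```

calling `write_pair("a.txt", "b.txt")` and then timing each diff implementation on the pair (for example with the shell's `time`) gives a rough comparison. the size and shape of the differences matter a lot for diff performance, so it is worth varying `n_changes` and `vocab`.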

i've added this to my AOSP notes so i stop trying to talk you out of
switching to a better algorithm :-)

for now though, my _real_ problem is that i have diffs between my two 3
million line files... time to look at that!

