[Toybox] histogram diff

enh enh at google.com
Fri Jan 31 07:50:19 PST 2025


On Thu, Jan 30, 2025 at 7:01 PM Ray Gardner <raygard at gmail.com> wrote:
>
> > [patience diff] may have predated the "histogram" algorithm which is
> > surprisingly hard to google for...
> > ...
> > I was really hoping I could implement just ONE algorithm and call it good.
> > Right now it looks like "histogram" would be that one (it's an improvement
> > upon "patience" which is the one I'd planned to implement before somebody
> > else sent in a contribution). ...
>
> Yes, there was nothing on the Web explaining it well enough to implement it.
> I had to dig to figure it out.
>
> I have attached a patch to your "branch" diff.c that you've set aside for a
> few years. It replaces the hunky-dory code with a histogram diff
> implementation. Also implements diff "default" output as well as unified.
> It does not at present support -p, -B, -q, -s. I can work on them a bit if
> you are interested. I would leave -r to you, as well as color, label, etc.
>
> It does produce pretty decent diffs quickly, that match git's closely and
> jgit's exactly. I used it to produce the attached patch, of course. There is
> an explanation of the algo and code at the end after main(), and I've
> written a couple of posts about it at raygard.net. And have a working
> standalone version at github.com/raygard/hdiff/.

oh, wow, until reading your article i had no idea sop@ died in 2018...
i'd assumed he'd just been promoted to where he wasn't making code
changes any more :-(

(btw, your "more on" article might want to explicitly link to your
_original_ article, rather than just to places that _didn't_ clearly
explain the algorithm!)

> Ray

