[Toybox] histogram diff

Rob Landley rob at landley.net
Fri Jan 31 06:17:01 PST 2025


On 1/30/25 17:59, Ray Gardner wrote:
>> [patience diff] may have predated the "histogram" algorithm which is
>> surprisingly hard to google for...
>> ...
>> I was really hoping I could implement just ONE algorithm and call it good.
>> Right now it looks like "histogram" would be that one (it's an improvement
>> upon "patience" which is the one I'd planned to implement before somebody
>> else sent in a contribution). ...
> 
> Yes, there was nothing on the Web explaining it well enough to implement it.
> I had to dig to figure it out.
> 
> I have attached a patch to your "branch" diff.c that you've set aside for a
> few years. It replaces the hunky-dory code with a histogram diff
> implementation.

I just started work on my diff branch again after 
https://github.com/landley/toybox/issues/489#issuecomment-2588079658 
with the goal of replacing code I don't understand with code I do 
understand.

What is "hunky-dory code" in this context? I just did a streaming diff 
that's more or less the reverse of how my "patch" code works. Where I 
left off was that although the hunk detection does X matching line pairs 
at the start and end to delineate a hunk (in the absence of EOF/SOF 
forcibly delineating a hunk), when it then emits those lines I have to 
manually peel off those matching start/end sets or else the automatic + 
and - emitter doesn't always naturally give the right set of trailing 
unchanged lines (I had a test input that produced the same size hunk but 
the last 3 weren't all expressed as leading space lines if I just let it 
zip them together), which confused patch.c into thinking this hunk could 
only match at EOF because it had too few trailing match lines.
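
For illustration only, the peeling step described above might look roughly 
like the sketch below. This is not the code from the diff branch: 
emit_hunk_body() and its two helpers are hypothetical names, and the sketch 
assumes the old and new line ranges of one hunk are already in memory as 
arrays of strings without trailing newlines. It counts the matching lines 
at the start and end first, so they are always written as context (leading 
space) lines, and only the middle goes through the - and + output.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: count identical lines at the start of both ranges.
static int common_prefix(char **a, int alen, char **b, int blen)
{
  int n = 0;

  while (n < alen && n < blen && !strcmp(a[n], b[n])) n++;

  return n;
}

// Hypothetical helper: count identical lines at the end of both ranges,
// never reaching back into the first "skip" lines already used as prefix.
static int common_suffix(char **a, int alen, char **b, int blen, int skip)
{
  int n = 0;

  while (n < alen-skip && n < blen-skip && !strcmp(a[alen-1-n], b[blen-1-n]))
    n++;

  return n;
}

// Write one hunk body: leading context, removals, additions, trailing
// context. The @@ header is left out to keep the sketch short.
void emit_hunk_body(FILE *out, char **a, int alen, char **b, int blen)
{
  int pre = common_prefix(a, alen, b, blen),
      post = common_suffix(a, alen, b, blen, pre), i;

  for (i = 0; i < pre; i++) fprintf(out, " %s\n", a[i]);
  for (i = pre; i < alen-post; i++) fprintf(out, "-%s\n", a[i]);
  for (i = pre; i < blen-post; i++) fprintf(out, "+%s\n", b[i]);
  for (i = alen-post; i < alen; i++) fprintf(out, " %s\n", a[i]);
}
```

With old lines a b c old d e f and new lines a b c new d e f, this writes 
three context lines, one - line, one + line, then three context lines, so 
the trailing context count stays at 3 regardless of how the middle is paired.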

I want to dig up that specific test again, but it's buried in some old 
directory. That's why I hadn't tried to check stuff in yet...

If I throw out what I've done and replace the code I understand with code I 
don't understand, it will go behind "man.c" on the todo list.

Rob

