[Toybox] diff.c

Rob Landley rob at landley.net
Wed Aug 24 01:34:15 PDT 2022


On 8/22/22 10:39, enh via Toybox wrote:
> On Sun, Aug 21, 2022 at 9:12 AM Ray Gardner <raygard at gmail.com> wrote:
> 
>     I guess I'll skip writing up an explanation of these algos then; I
>     thought you were looking for a less "mathematical" explanation ("why
>     can nobody explain what the old stuff is doing without pretending it's
>     math instead of an algorithm?" "Doug McIlroy's old diff paper from
>     1976 is still written in math-ese. It SEEMS to be describing very
>     simple concepts but it's trying to explain them as if this is
>     calculus, which it is not." etc.) But that's moot if you're going with
>     what appears to be the old "look ahead a little for matching lines to
>     re-sync on" heuristic diff method.
> 
> (fwiw, i think such a document would be a useful thing to leave lying around on
> the internet ... someone will want it sooner or later :-) )

I'm also interested. I've just been a bit distracted because my air conditioner
broke over the weekend (the soonest somebody can come fix it is Thursday, and
it's August in Texas...) and because my diff code isn't quite ready to run this
new test through yet, and I wanted to include that in my reply.

By the way, using ASAN in testing is kind of annoying. I thought running it at
the "it finally compiled, no idea if anything works" stage might be quicker than
my usual "lots of test data and sticking a zillion printf()s into stuff", but...

=================================================================
==12782==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000071 at pc 0x7fdb0f844514 bp 0x7ffd8096d900 sp 0x7ffd8096d0b0
READ of size 2 at 0x602000000071 thread T0
    #0 0x7fdb0f844513  (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x53513)
    #1 0x7fdb0f84502a in __interceptor_vprintf (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x5402a)
    #2 0x55c63a9693af in xprintf lib/xwrap.c:158

0x602000000071 is located 0 bytes to the right of 1-byte region [0x602000000070,0x602000000071)
allocated by thread T0 here:
    #0 0x7fdb0f8da330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330)
    #1 0x55c63a968fc1 in xmalloc lib/xwrap.c:71

So "zero bytes to the right" is bad, xprintf() was called BY NOTHING, and
there's no dump of the data IN the region but there is:

Shadow bytes around the buggy address:
...
=>0x0c047fff8000: fa fa fd fa fa fa 00 04 fa fa 00 04 fa fa[01]fa

Which is completely unrelated to ANY of the above addresses?
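
For reference, the shadow line does map back to the faulting address if you
assume ASAN's default x86-64 Linux layout: one shadow byte per 8 application
bytes, at (addr >> 3) + 0x7fff8000. The sketch below is only that arithmetic.
The function names are made up for illustration, and the offset and 16-byte row
width are assumptions about this particular platform and ASAN build, not
something the log itself states.

```c
#include <stdint.h>

// Assumed default ASAN mapping for x86-64 Linux: 8 app bytes per shadow byte.
#define SHADOW_SCALE  3
#define SHADOW_OFFSET 0x7fff8000ULL

// Address of the shadow byte describing the 8-byte granule containing addr.
static uint64_t shadow_addr(uint64_t addr)
{
  return (addr >> SHADOW_SCALE) + SHADOW_OFFSET;
}

// Start of the 16-byte row the report prints that shadow byte in.
static uint64_t shadow_row(uint64_t addr)
{
  return shadow_addr(addr) & ~0xfULL;
}

// Column (0-15) of that shadow byte within its row.
static int shadow_column(uint64_t addr)
{
  return (int)(shadow_addr(addr) & 0xf);
}

// Offset of addr inside its 8-byte granule. A shadow value k in 1..7 means
// only the first k bytes of the granule are addressable.
static int granule_offset(uint64_t addr)
{
  return (int)(addr & 7);
}
```

Feeding in 0x602000000071 gives shadow byte 0x0c047fff800e, which is row
0x0c047fff8000, column 14: the [01] in the dump. The 01 says only the first
byte of that granule is valid (the 1-byte malloc), and the read started at
offset 1 within it.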

(The bug was a missing test: "diff one two" was trying to report "Only in
%(dir): %(file)" for both one and two, but the dir part of each was being
calculated on the null terminator for the empty list of directories (not IN a
subdir yet). That should never happen, because the test that was missing is
what prevents it. But "ASAN was remarkably unhelpful" was the point of my
remark above.)
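
A minimal sketch of that kind of guard, assuming nothing about the real diff.c:
only_in() and its arguments are hypothetical names, and splitting on the last
'/' is just one way to get a dir part. The point is that the split is skipped
when there is no directory component to split off.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper, not toybox code: format an "Only in dir: file" line.
// Returns 1 if a line was written to out, 0 if path has no directory part.
static int only_in(char *out, size_t len, const char *path)
{
  const char *slash = strrchr(path, '/');

  // The guard: with no '/' we are not in a subdir yet, so there is no dir
  // part to compute and nothing should be read past the empty string.
  if (!slash) return 0;

  // Print the bytes before the last '/' as the dir, the rest as the file.
  snprintf(out, len, "Only in %.*s: %s", (int)(slash-path), path, slash+1);

  return 1;
}
```

With this, only_in(buf, sizeof(buf), "one/file") fills buf with "Only in one:
file", while a bare "one" returns 0 and leaves buf alone.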

>     Anyway, I applied your 1-or-2 line patch to the Aug. 8 attachment.c
>     you posted and it still chokes on the two files I sent you (dif_old2.c
>     and dif_new.c). So maybe you've made some other significant changes to
>     that code, if it works for you? It appeared to be stuck in dump_hunk()
>     but I didn't look any deeper.

Lemme get back to you on this. (I'm to the "it compiles but that segfault was
because if (s) should have been if (!s) and now there's another crash later"
stage...)

Sorry I'm not faster. Working on it...

Rob

