[Toybox] diff.c

enh enh at google.com
Fri Aug 26 08:20:36 PDT 2022


On Fri, Aug 26, 2022 at 2:41 AM Rob Landley <rob at landley.net> wrote:

> On 8/25/22 09:43, enh wrote:
> >     *shrug* I've been tracking these things down for years without this
> >     tool, it's not a blocker. But the callstack would save time bisecting
> >     the code with printfs to figure out where it went off the rails...
> >
> > yeah, although it's not nearly as cool without the stacks, just knowing
> > you have a problem is step 1.
>
> It's a hang without ASAN. Not exactly subtle. :)
>
> The immediate problem is that my dump_hunks() is getting lines out of sync
> and falling off the end of one of the arrays. The more INTERESTING problem
> is that debian's diff says the failing hunk is:
>
> --- dif1.c      2022-08-26 00:54:19.827964685 -0500
> +++ dif2.c      2022-08-26 00:54:41.231964278 -0500
> @@ -8,30 +8,15 @@
>                 return strcmp(ln1->linedata, ln2->linedata) == MATCH;
>  }
>
> -BOOL match(LINE *oldp, LINE *newp)
> +bool match(LINE *oldp, LINE *newp)
>  {
>         int i;
>         for ( i=0; i < minmatch; i++, oldp = oldp->next, newp = newp->next )
>                 if ( !eq(oldp, newp) )
> -                       return FALSE;
> -       return TRUE;
> +                       return false;
> +       return true;
>  }
>
> -#if 00
> -void putqln(LINE *pln, DFILE *file)
> -{
> -       if ( ! pln->lneof ) {
> -               if ( in_context_sw )
> -                       if ( file == oldf )
> -                               printf("<<<  ");
> -                       else
> -                               printf("| ");
> -               printf("%s\n", pln->linedata);
> -       }
> -       freeln(pln);
> -}
> -#endif
> -
>  void putqln(LINE *pln, DFILE *file)
>  {
>         if ( ! pln->lneof ) {
>
> Which, if you'll notice, repeats the last three lines: they're removed right
> after the #if and also occur after the #endif as the last three lines. And my
> simple/greedy algorithm is trying to call the first three _matches_ and then
> have the rest of the file be one big subtraction, which means it's not nicely
> bracketed with matching intro/exit lines. (The find_hunk() logic ensures such
> a bracketing, but the dump_hunk() logic's simplistic decision on how to
> display it does not.)
>
> Also, debian is saying -8,30 +8,15 and mine's saying -8,29 +8,14 which I'm
> still trying to track down....
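
For reference, the two lengths in a unified diff hunk header can be recomputed from the hunk body: context lines count on both sides, "-" lines only on the old side, "+" lines only on the new side. The hunk quoted above has 12 context lines, 18 removed and 3 added, which gives 30 and 15. Below is a minimal sketch of that tally; count_hunk() and struct hunk_len are hypothetical names for illustration, not toybox code. A result that is one short on both sides is what a single dropped context line would produce, though that is only a guess at the cause here.

```c
#include <stddef.h>

/* Line counts for each side of one unified diff hunk. */
struct hunk_len {
    int old_len;   /* the len in "-start,len" */
    int new_len;   /* the len in "+start,len" */
};

/*
 * Hypothetical helper (not toybox code): tally a hunk body by the first
 * character of each line. ' ' is context and counts on both sides, '-'
 * counts only on the old side, '+' only on the new side.
 */
struct hunk_len count_hunk(const char *const *lines, size_t n)
{
    struct hunk_len len = {0, 0};
    size_t i;

    for (i = 0; i < n; i++) {
        switch (lines[i][0]) {
        case ' ': len.old_len++; len.new_len++; break;
        case '-': len.old_len++; break;
        case '+': len.new_len++; break;
        default: break;   /* e.g. "\ No newline at end of file" */
        }
    }
    return len;
}
```

Feeding it the body lines of the hunk above (everything after the @@ line) should reproduce debian's -8,30 +8,15.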
>
> On the whole, good test case. :)
>
> >     I made it as far as 'both of those have value fa, meaning "Heap left
> >     redzone"' and stopped because I have other things to do. This goes on
> >     the todo heap with valgrind and making better use of gdb and so on.
> >
> > yeah, like i say --- i've been a heavy asan/hwasan user for years but i
> > don't think i've _once_ used the shadow map. as far as i'm concerned it's
> > just "how it works", so none of my business. (though when i talked to them
> > about the error wording [for which there are now llvm patches up], they
> > said they should fix the addresses in the dump to be the actual heap
> > addresses, not the shadow addresses.)
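
For anyone who does want to read the dump: ASan's shadow map is a direct mapping in which one shadow byte describes 8 bytes of application memory, so a shadow address can be converted back by hand. The sketch below assumes the usual x86-64 Linux layout (scale 3, offset 0x7fff8000). Those constants are an assumption here, not something stated in this thread, and they differ on other platforms. The function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed constants: the common x86-64 Linux ASan layout. */
#define SHADOW_SCALE  3
#define SHADOW_OFFSET 0x7fff8000ULL

/* Map an application address to the address of its shadow byte. */
uint64_t mem_to_shadow(uint64_t addr)
{
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET;
}

/* Inverse: first application address covered by a given shadow byte. */
uint64_t shadow_to_mem(uint64_t shadow)
{
    return (shadow - SHADOW_OFFSET) << SHADOW_SCALE;
}

/* 0xfa is the shadow value the report legend calls "Heap left redzone". */
int is_heap_left_redzone(uint8_t shadow_byte)
{
    return shadow_byte == 0xfa;
}
```

Under those assumptions, an address printed in the shadow dump goes through shadow_to_mem() to get the start of the 8-byte heap granule it describes.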
>
> People have done things like "electric fence" for decades, usually with a
> horrible performance penalty. After QEMU and xen/kvm got popular intel and
> arm got into a race to improve their mmu capabilities and people started
> trying to apply that to the memory access pattern validation problems
> (multiple lwn.net articles about that a decade or so back) with the dream of
> making it cheap enough to leave on at deployment, and it's nice to see that
> stuff finally bear fruit. But it's not exactly new. :)
>
> I first wrote my own heap walker to periodically validate its integrity back
> under OS/2. (The codebase I inherited had _five_ alloc/free contexts in play
> all at once and every once in a while the OS/2 equivalent of MAP_ANONYMOUS
> would get passed to SOM_free() and it would quietly swallow it and continue
> for 5 seconds or so and then an unrelated thread would explode. Yes that was
> in a heavily threaded environment. The OS/2 desktop ("workplace shell")
> instantiated new objects by loading shared libraries into a giant shared
> process space. (And I think firing up a new thread to run its constructor
> function? It's been a while.) I worked on their new package management
> system, "Feature install", which was a subclass of the "folder" object in the
> workplace shell which was built on top of IBM's System Object Model (one of
> the first implementations of the Common Object Request Broker Architecture
> which was just horrific) which had metaclass instance objects acting as
> factories (java did NOT have proper metaclasses, at least not for many
> years), but ultimately it all got its memory from the heap maintained by the
> C library which got memory from the OS. Of COURSE I independently invented
> page poisoning without knowing what it was called. I did the same for "linked
> lists" as a teenager. Heck, in college I reinvented bytecode and was all
> excited about it until I got introduced to java a few years later.)
>
> This is one of the reasons computer history interests me. The new people
> reinventing the wheel for the 50th time mistake the ruts from heavily trodden
> ground for geology.


i don't think that's true of any of the asan/hwasan/mte folks ... my feeling
from working with them for years is that they were quite clear that their
mission was "how do we make these abilities mainstream --- something that
everyone's using, all the time, everywhere?". (aka "your umbrella's no use
if you left it at home" :-) )

to me that's the flip side of the william gibson "the future's already
here, it's just unevenly distributed" quote --- the most useful and
impactful work you can be doing at any given time is making something good
cheap enough that everyone can have one. (obviously that's not an unalloyed
good: i'm happier about bicycle proliferation than car proliferation, for
example, and belatedly moving away from the ICE only solves one of the
problems with cars, but at least down in my corner of the tech world it's
hard to argue against robustness and security. our negative side-effects
are mainly just "cheap != free". a $30 phone still isn't as secure as a
$100 phone, for example [though of course there are positive side-effects
for the less protected devices of having found and fixed more bugs on the
more protected devices and sharing the code].)


> You want to find the REALLY fun ideas, ask why Grace Hopper did what she did
> when inventing shared libraries. (She talked about it in the HOPL keynote
> talk, which is in a book in the UT library I photocopied a bunch of pages out
> of, but not available online that I know of? In theory the talk is on video,
> in PRACTICE I went to the library that claimed to have it and they didn't
> want to dig the old VHS tapes out of the back room because they were too
> fragile or something... Ah, looks like it might be available online?
> https://dl.acm.org/doi/10.1145/800025.1198341 )
>
> Anyway, back to poking at diff...
>
> Rob
>