[Toybox] diff.c

enh enh at google.com
Thu Aug 25 07:43:10 PDT 2022


On Thu, Aug 25, 2022 at 4:18 AM Rob Landley <rob at landley.net> wrote:

> On 8/24/22 10:22, enh wrote:
> >     xprintf() was called BY NOTHING, and
> >     there's no dump of the data IN the region but there is:
> >
> > did you compile everything with `-fno-omit-frame-pointer`? iirc asan's unwinder
> > needs frame pointers (because you want it to be fast, because it's recording a
> > lot of stacks). works for free on arm64, but on the z80 with no registers that
> > you're using, you'll need to tell the compiler. (if you're using my ASAN=1
> > support in toybox you should get that, but maybe you need to do a clean rebuild?)
>
> I've made it to poking at Ray Gardner's test case, and:
>
> $ ASAN=1 make clean diff && ./diff thingy1 thingy2
> ...
> ==24453==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60c00002643f
> at pc 0x563d7667e9e9 bp 0x7ffe72b18420 sp 0x7ffe72b18418
> READ of size 1 at 0x60c00002643f thread T0
>     #0 0x563d7667e9e8 in diffcmp toys/pending/diff.c:69
>
> Which is diffcmp(), which was called by nothing. (Rather than by qsort(), which
> is normally what calls it?) It did a clean and then rebuilt everything. Maybe
> glibc is built without frame pointers?
>

ugh. not being a glibc person, i hadn't thought of this. (there are so many
reasons i can't wait for [non-mac] arm64 laptops to come down to sensible
prices, and having enough registers not to scrimp and save is one of them!)

i'll ask, but since google3 builds its own libc and android _has_ its own
libc, i don't know what the answer will be. (something to add to your musl
build flags though :-) )

(amusingly, a quick web search found a thread where they were talking to
glibc folks about not just building glibc with frame pointers but building
it with asan instrumentation, and were finding memory bugs _in_ glibc that
were getting in the way... that was 2016 though.)


> *shrug* I've been tracking these things down for years without this tool, it's
> not a blocker. But the callstack would save time bisecting the code with printfs
> to figure out where it went off the rails...
>

yeah, although it's not nearly as cool without the stacks, just knowing you
have a problem is step 1.


> >     Shadow bytes around the buggy address:
> >     ...
> >     =>0x0c047fff8000: fa fa fd fa fa fa 00 04 fa fa 00 04 fa fa[01]fa
> >
> >     Which is completely unrelated to ANY of the above addresses?
> >
> > i never use the shadow so i've never noticed that myself, but i assume it's just
> > because it's showing the _shadow_ address, and (since the shadow is smaller than
> > the real memory) it can't really show corresponding addresses.
>
> Dunno what a shadow address is... it's the address of a shadow byte. Lovely.
> What is...
>
>
> https://stackoverflow.com/questions/61674317/what-are-shadow-bytes-in-addresssanitizer-and-how-should-i-interpret-them
>
> I made it as far as 'both of those have value fa, meaning "Heap left redzone"'
> and stopped because I have other things to do. This goes on the todo heap with
> valgrind and making better use of gdb and so on.
>

yeah, like i say --- i've been a heavy asan/hwasan user for years but i
don't think i've _once_ used the shadow map. as far as i'm concerned it's
just "how it works", so none of my business. (though when i talked to them
about the error wording [for which there are now llvm patches up], they
said they should fix the addresses in the dump to be the actual heap
addresses, not the shadow addresses.)


> My first job out of college, I came fully up to speed on the debugger built into
> the IDE in IBM's OS/2 compiler (as opposed to EMX, the OS/2 port of gcc I'd been
> using before), and I used it for EVERYTHING... until I left that job and had to
> leave that toolset behind.
>
> Since then?
>
>   https://www.youtube.com/watch?v=nCKkHqlx9dE
>
> If you can't reproduce it from first principles, you're not doing science. (Yes
> I built the OS from source. Yes I maintained a C compiler fork. Yes I worked on
> a CPU implementation in VHDL. There's a theme here...)
>
> > yeah, it's not just you. you do get used to them -- since they all look roughly
> > the same -- but it's unfortunate you have to. (i think just "before" or "after"
> > would be good steps forward.)
>
> I lean towards depth first rather than breadth first searches, and I'm just
> climbing OUT of the diff rathole. Trying to close tabs...
>
> Rob
>