[Toybox] [PATCH] Add ASAN=1 to the build system.

enh enh at google.com
Thu Sep 12 12:17:51 PDT 2019


On Mon, Jul 29, 2019 at 2:28 AM Rob Landley <rob at landley.net> wrote:
>
> On 7/29/19 12:19 AM, enh wrote:
> >> yeah, Google first started to implement it in GCC, but gave up and
> >> reimplemented it much better in clang. i don't actually know what the
> >> state of the GCC stuff is beyond "was never as good". i thought they'd
> >> removed it, tbh.
> >
> > i ran the same experiment with GCC 7.3 on Debian, and it caught the
> > error too, so even though i don't know what state the GCC
> > implementation is in, it's clearly not useless.
>
> As long as it doesn't break the build or the resulting binary's behavior, I lean
> towards just having it switch it on regardless of toolchain.
>
> >>> What did you _mean_ to do? Why was it failing for you before? I'm confused.
> >>
> >> build with CC="clang" and see all the warnings. the current test isn't
> >> working, but my guess as to why was wrong and "worked" by accident.
> >
> > looking at this, i think i must have done the `make defconfig` without
> > `CC=clang`. i can't reproduce the issues i was seeing otherwise. (i
> > should `export X=y` rather than remembering to always supply it on
> > every make invocation...)
>
> I've switched toolchains without a "make distclean" in between on a number of
> occasions, and learned to recognize the flying debris. I've got three of 'em I'm
> trying to keep reasonably current: one each for glibc, musl, and bionic.
>
> There's plumbing in make.sh that _attempts_ to catch make defconfig and make
> having wildly different flags, but it could use more work. It's this bit:
>
> if ! cmp -s <(genbuildsh 2>/dev/null | head -n 6 ; echo LINK="'"$LDOPTIMIZE
> $LDFLAGS) <(head -n 7 generated/build.sh 2>/dev/null | $SED '7s/ -o .*//')
> then
>   echo -n "Library probe"
>
> That checks the $PATH, compiler command line (with all flags), and link command
> line (with all flags) against the one used last time, and if they don't match
> attempts to rebuild them.
>
> It should probably just blow away generated/ and rebuild all of it. (There's no
> shortage of corners of the project I could sit down and spend a weekend on...)
>
> >>>> I've also fixed (and modernized) the "are we root?" check in the
> >>>> hostname test,
> >>>
> >>> Was it broken, or just ugly? (Fixed implies broken?)
> >>
> >> the _test_ was okay, but the _output_ was wrong and confusing.
> >
> > that was fixed separately in 26f3ca413c7fa7b1ba380f3c951004c109a47294.
> >
> > so... i've attached a new patch that just does the CFLAGS fiddling.
> > tested with both clang and gcc (on host Debian), and catches the grep
> > bugs if i revert the fix.
> >
> > [PATCH] Add ASAN=1 to the build system.
>
> Applied.
>
> > Just use `ASAN=1 make test_grep` or whatever.
> >
> > You'll probably want to set $ASAN_SYMBOLIZER_PATH to point to
> > llvm-symbolizer, but Debian makes that annoying by calling the
> > symbolizer /usr/bin/llvm-symbolizer-4.0 or whatever, and ASan refuses to
> > use it:
>
> What's a symbolizer?

"addr2line".

> Also, before this release I really need to fluff out the FAQ. I've got over a
> dozen half-written FAQ entries queued up (including a bunch I wrote for busybox
> back in the day which apply to both projects, some of which have sadly been
> _screwed_up_ in the busybox FAQ since I left...)
>
> I should probably have a FAQ entry about ASAN, because right _now_ I dunno
> what/how to use it for. (Honestly, there should be a "modern lint" howto. I
> generally find static analyzers to be false positive generators, and the various
> gcc things like libmudflap never seemed useful. But I've had a todo item to get
> valgrind to actually do something useful for many moons now. It's... not at the
> top of the list. :)

(markdown follows, but this should be something you can paste into the
FAQ to get started...)

# Using ASan

[ASan](https://github.com/google/sanitizers/wiki/AddressSanitizer) detects
use-after-free, use-after-return, and heap/stack/global buffer overflows.

To build toybox with ASan, set `ASAN=1`.

To get symbolized stacks, set `ASAN_SYMBOLIZER_PATH` to point to your
llvm-symbolizer binary. On most Linux distributions, this will have the LLVM
version number in the name for some reason, so typing `llvm-symbolizer` and
pressing tab is probably the best way to find what you have installed. Note
that ASan currently requires the full path (including `/usr/bin/` or whatever).
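If you'd rather not rely on tab completion, the search can be scripted. This is only a sketch: `find_symbolizer` is a hypothetical helper (not part of toybox), and the directories in the usage lines are assumptions about where your distribution installs LLVM. As noted earlier in this thread, some ASan versions reject a versioned binary name, so a path found this way may still need adjusting.

```shell
# Hypothetical helper: print the full path of the first llvm-symbolizer
# (plain or version-suffixed) found in the directories given as arguments.
find_symbolizer() {
  for dir in "$@"; do
    for candidate in "$dir"/llvm-symbolizer "$dir"/llvm-symbolizer-*; do
      # An unmatched glob stays literal and simply fails the -f test.
      if [ -f "$candidate" ] && [ -x "$candidate" ]; then
        printf '%s\n' "$candidate"
        return 0
      fi
    done
  done
  return 1
}

# Usage: ASan wants the full path, so export whatever was found.
# /usr/bin and /usr/local/bin are guesses; adjust for your system.
if path=$(find_symbolizer /usr/bin /usr/local/bin); then
  export ASAN_SYMBOLIZER_PATH="$path"
fi
```

An unversioned `llvm-symbolizer` is preferred when both exist in the same directory, since it is tried first.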

When a toybox built with ASan overflows a buffer or double frees, it will abort
with a clear error message and stack traces for both the point at which the
error occurred and also (where relevant) the point at which the memory was
allocated.

Note that mixing ASan and non-ASan objects is not supported, so you'll want
to `make clean` if you're turning `ASAN=1` on and off.
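One way to avoid forgetting the clean step is to wrap it. The following is a minimal sketch: `asan_switch` is a hypothetical wrapper, not something in the toybox tree, and it assumes `ASAN` is not already exported in your environment. The `MAKE` override exists only so the function can be dry-run.

```shell
# Hypothetical wrapper: always clean first, then build/run a make target
# with or without ASan.
#   $1: 1 to enable ASan, anything else to build without it
#   $2: make target, e.g. test_grep
asan_switch() {
  make_cmd=${MAKE:-make}
  # Stale objects from the other mode must not be mixed in.
  "$make_cmd" clean || return 1
  if [ "$1" = 1 ]; then
    ASAN=1 "$make_cmd" "$2"
  else
    "$make_cmd" "$2"
  fi
}

# Usage (run from the top of the toybox source tree):
#   asan_switch 1 test_grep   # ASan build
#   asan_switch 0 test_grep   # back to a normal build
```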

Testing with ASan is as simple as building with `ASAN=1` and then running
the tests as normal.

You can check whether your toybox binary is using ASan with ldd(1)
or readelf(1) --- there should be a reference to `libasan.so`.
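That check can be turned into a small filter. This is a sketch only: `mentions_libasan` is a hypothetical helper, and `./toybox` in the usage lines is assumed to be the binary you just built. A toolchain that links the ASan runtime statically will not list `libasan.so`, so treat a negative result as inconclusive.

```shell
# Hypothetical helper: succeed if the ldd(1) or `readelf -d` output
# on stdin mentions libasan.so.
mentions_libasan() {
  grep -q 'libasan\.so'
}

# Usage (assumes ./toybox is the binary to inspect):
#   ldd ./toybox | mentions_libasan && echo "linked against libasan"
#   readelf -d ./toybox | mentions_libasan && echo "linked against libasan"
```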

> I also want to run perf or similar against top and figure out what percentage of
> the CPU usage comes from where. The UTF8 display and the /proc parsing are the
> top two candidates. Alas, it does a lot of string manipulation, and that's not
> friendly to anybody's cache. But I'd like to get the CPU usage down from where
> it is. Also not near the top of the todo list (performance improvements are
> _mostly_ post-1.0 work, although they get bumped up when a user complains).
>
> Rob

