[Toybox] Fwd: config2help SEGFAULT

Jazzoo Watchman jazzoowatchman at gmail.com
Wed Nov 25 06:47:41 PST 2020


---------- Forwarded message ---------
From: Jazzoo Watchman <jazzoowatchman at gmail.com>
Date: Wed, Nov 25, 2020 at 8:14 AM
Subject: Re: [Toybox] config2help SEGFAULT
To: Rob Landley <rob at landley.net>


Rob,

Thank you for your time.  I have learned a great deal studying your work
and following you on and off since your time at the "fool."

My post was not meant in any way to criticize or second guess the code in
config2help.c. I seek to understand if my analysis follows the intent of
the author(s) or I missed the point entirely.  And on the off-chance that I
can help-out, to do so.

Cloned Toybox into a new folder.

Toybox builds cleanly.

I apologize for that distraction and thank you for the generous gift of
your time.

I'm trying to build a development environment for an arm926tj-s variant
device for a device that the OED and OEM have NOT released their source nor
their build tools.

Aboriginal-1.4.5 seems like a good starting point to generate the tools I'd
need.

Aboriginal has not been "touched" in 5 years ... but one can hope to stand
on the shoulders ...

If you can/will point me in a more fruitful direction I'd be grateful.

The following are the pieces to reproduce should you deem fit to comment.
Otherwise, please disregard; no harm, no foul.

Thanks again for your time,


James

Build Environment
* Linux 4.19.0-12-amd64 #1 SMP Debian 4.19.152-1 (2020-10-18) x86_64
GNU/Linux
* gcc (Debian 8.3.0-6) 8.3.0
* GNU ld (GNU Binutils for Debian) 2.31.1
* ldd (Debian GLIBC 2.28-10) 2.28

Cloned aboriginal-1.4.5 into an empty folder

ran ./build.sh armv5l

... snip out the clean parts ...
=== toybox (host host-tools)
Snapshot 'toybox'...
scripts/genconfig.sh
cc -o kconfig/conf kconfig/conf.c kconfig/zconf.tab.c -DKBUILD_NO_NLS=1 \
-DPROJECT_NAME=\"ToyBox\"
kconfig/conf -D /dev/null Config.in > /dev/null
scripts/make.sh
Generate headers from toys/*/*.c...
generated/newtoys.h Library probe.......
Make generated/config.h from .config.
generated/flags.h generated/globals.h generated/help.h
scripts/make.sh: line 218: 17069 Segmentation fault
 generated/config2help Config.in $KCONFIG_CONFIG > generated/help.h
make: *** [Makefile:19: toybox] Error 1

Exiting due to errors (host host-tools toybox)

real 0m4.541s
user 0m3.272s
sys 0m1.597s


On 11/24/20 12:35 PM, Jazzoo Watchman wrote:

> > Hello all,
> > Hope this finds you well.
> >
> > Trying to run config2help it segfaults.
>
> It doesn't for me, which means I can't reproduce this, which means I won't
> know
> if I've fixed it.
>
> You don't mention how you did it. Which distro? Which compiler version?
> Which
> build flags? I'm guessing you're using some kind of page poisoning?
>
> > Pulling the source from Toybox and building with
> >   gcc -0 -g -Wall
> >
> > That too segfaults so I think I have the latest version (the source
> seems to be
> > mostly contained in a single file scripts/config2help.c).
> >
> > Google(ing) turned up many hits of fairly recent (circa 2018+)
> conversations on
> > or about this same complaint.
> >
> > Digging into the logic of the code, it seems there may be some cruft/age
> to the
> > logic and some mishandling of pointer use AFTER free (as shown by
> valgrind).  In
>
> It's more that it's a quick and dirty build time tool that works on known
> inputs, so I got it working and then didn't care. The only way it could be
> "exploitable" is somehow tricking someone into running the build with
> different
> inputs...?
>
> The comment at the top of mkflags.c is a bit more explicit:
>
> // This is intentionally crappy code because we control the inputs. It
> leaks
> // memory like a sieve and segfaults if malloc returns null, but does the
> job.
>
> I have the following diff in my tree as a todo:
>
> --- a/scripts/config2help.c
> +++ b/scripts/config2help.c
> @@ -146,7 +146,8 @@ char *dlist_zap(struct double_list **help)
>    struct double_list *dd = dlist_pop(help);
>    char *s = dd->data;
>
> -  free(dd);
> +  memset(dd, 0xaa, sizeof(struct double_list));
> +//  free(dd);
>
>    return s;
>  }
>
> Which might or might not be related to what you're seeing because you
> didn't
> tell me how to reproduce what you're seeing. To be honest, the "fix" I'd
> do is
> just remove the free and leak the data because again: quick and dirty
> build tool
> working on known input? generated/flags.raw is like 10k of data, it can't
> leak
> enough to matter over the lifetime of the tool's run.
>
> (The only reason I have the patch is in case I got around to doing a
> better fix
> than that.)
>
> > "some" aforementioned Google(ing) complaints seemed so stem from an
> update of
> > their glibc.  I can attest to the same as my builds were going  smoothly
> until I
> > did an apt update; apt upgrade .
>
> Ok, so you're using a debian derivative of some sort. This is the first
> information you've given me about your build environment. I'm guessing you
> "upgraded" gcc (or llvm?) to a newer version that enables an address
> sanitization flag by default? But that's just a guess because you didn't
> SAY.
>
> > My point.
> >
> > 1.)
> > Is the latest source (hopefully) with a fix for the "use after free"
> segfault in
> > "https://github.com/landley/toybox/blob/master/scripts/config2help.c"
>
> Given that you apparently haven't built that, now I have to ask "what
> version of
> toybox _did_ you build"?
>
> I don't know what you built, _and_ I don't know how you built it.
>
> > 2.)
> > Following the code in 1.) it seems that some of the logic may be more
> complex
> > than the input would suggest.
>
> It turns out config2help.c did have a reason to exist, yes.
>
> > The code inspecting the generated "sym" list while looping through the
> entries
> > in ".config" and setting sym->enabled ... indicates to me that ".config"
> > controls which "processed" "symbols" get spit-out as "#defined HELPXXX".
> >
> > Is that true and or intended?
>
> The config controls what output is produced. More output has been added
> over
> time to a loop that already had selection logic in it. As I said: quick and
> dirty build tool, which has had its design goals shift out from under it a
> bit
> over the years.
>
> I have a post-1.0 todo item to rewrite kconfig (so I'm not using a stale
> snapshot of gplv2 kernel code), and as part of that all this build
> infrastructure might change, or go away and get completely redone. Dunno
> yet,
> haven't designed the replacement.
>
> > The flow seems to indicate that multiple "symbols" of the same "name"
> may/do
> > exist over multiple Config.in (and "sourced") files; and "duplicates"
> > need/should to be "collated".
>
> The nature of kconfig is there's dependency information: symbols depend on
> other
> symbols. I repurposed kconfig's compile-time help text to provide each
> command's
> --help output, which requires the help text of dependent symbols to be
> collated
> into a single help block with the main command symbol's help text (the one
> that
> controls whether or not a command by that name gets installed), and THAT
> requires the help text to have a common structure sections can be parsed
> out and
> processed differently.
>
> We used to make much heavier use of this (like busybox does), but these
> days
> I've ripped out most command sub-options, with only a few like
> CONFIG_LSPCI_TEXT
> remaining. If I unify all of them like that, then most of the help text
> processing can go away. But I haven't done that yet because it has side
> effects
> I need to think through.
>
> We might still need the FLAG_ enable and disable logic to let the dead code
> elimination part of the compiler's optimizer shrink commands in certain
> configurations, but at the same time without command options that would
> only
> apply to multiple commands in the same command.c file, and most of the time
> those FORCE_FLAGS to disable that behavior when it would apply (for reasons
> probably explained in code.html?) So MAYBE that goes too...? Given that
> LLVM
> also has __COUNTER__ possibly mkflags.c isn't needed at all? Except could
> that
> work with the re-include of flags.h to CLEANUP_xxx ? In theory I can
> snapshot
> the current value of __COUNTER__ and then subtract to get "how many times
> since
> last reset", and MAYBE grep/sed can chop out the optstr and expand it, but
> is
> that really cleaner...?
>
> tl;dr: I used to use certain techniques a lot, these days I don't and have
> removed most of the old instances of them, but there's still a few I
> haven't
> chased down. If I DO chase them down, I need to do a design and analysis
> pass to
> figure out what support infrastructure would still be needed, and I
> haven't yet.
> I've been busy with other things, and it's overshadowed by the need to
> write a
> new menuconfig from scratch. And none of this is pressing because the old
> one
> works for now and I have higher priority todo items.
>
> > In practice; I haven't seen "duplicate" "symbols"
> > and wonder if mashing "duplicates" together wouldn't lead to probable
> > error/nonsense.
>
> CONFIG_LSPCI_TEXT has the help text:
>
>   usage: lspci [-n] [-i FILE ]
>
>   -n    Numeric output (repeat for readable and numeric)
>   -i    PCI ID database (default /usr/share/misc/pci.ids)
>
> Which needs to be spliced into the CONFIG_LSPCI help text. That sort of
> thing
> used to be a lot more common. I haven't yet decided to completely
> eliminate it
> (permanently hardwiring all such options on), and worked out what subset of
> config2help.c would still be needed if I did, because I'm busy with things
> like
> trying to finish toys/*/sh.c
>
> > 3.)
> > If the above are/is true; would a rewrite with less "dependencies" and a
> "clean"
> > valgrind run be at-all welcome?
>
> I haven't done a new design pass on the build infrastructure, in part
> because
> deciding what's stale and what isn't requires janitorial work on the rest
> of the
> tree to make sure it's no longer used (eliminating the last few users),
> and then
> work out the best way to do what's left, which isn't necessarily a subset
> of the
> existing code but may be something entirely different. For one thing, I'm
> most
> of the way through writing a new shell with known behavior which I could
> then
> rely on to have things like "wait -n" so scripts/make.sh could more
> reliably
> keep multiple processors active instead of using the "portable" (to old
> bash
> versions at least) wait for specific $PID version which isn't necessarily
> the
> nex tone to finish because gcc instances take varying amounts of time to
> run
> depending on what .c file they were given... which means maybe I could
> write a
> config2help.c replacement as a shell function using variable arrays, I
> don't
> know yet?
>
> You seem to be volunteering to understand what toybox needs better than I
> do.
>
> > In closing, am I just lost in the woods?  If so, could someone spin me
> around
> > and nudge me in the correct direction?
>
> You had a build break. You didn't tell me how to reproduce the build break
> you're seeing so I can tell whether or not I fixed it. That APPEARS to be
> the
> problem at hand.
>
> One quick and dirty fix to your build break is not to run address
> sanitization
> on build-time tools. (HOSTCC vs target CC.) Alternately, I can remove the
> free()
> so it just leaks that field, because it's just a compile time tool working
> on
> 10k of known input.
>
> Rob
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.landley.net/pipermail/toybox-landley.net/attachments/20201125/40888dd9/attachment-0001.htm>


More information about the Toybox mailing list