[Aboriginal] Proposed patches to support modern toolchain

Tristan Van Berkom tristan.vanberkom at codethink.co.uk
Fri Feb 12 02:19:01 PST 2016


On Thu, 2016-02-11 at 13:14 -0600, Rob Landley wrote:
> On 02/08/2016 11:42 PM, Tristan Van Berkom wrote:
> > Hi,
[...]
> > Changes to the payload
> > ~~~~~~~~~~~~~~~~~~~~~~
> > This patch set introduces the following packages and patches to the
> > build.
> > 
> > 
> >   Binutils 2.25.1
> >   ~~~~~~~~~~~~~~~
> >   Latest release tarball of binutils.
> > 
> >   I have also included the musl related patches, imported from the
> >   musl-cross-make project[0].
> > 
> > 
> >   GCC 5.3.0
> >   ~~~~~~~~~
> >   Latest release tarball of GCC
> > 
> >   I have also included Gregor Richards' patch set[1] to build
> >   GCC 5.3 against musl. These are more up to date than the
> >   patches in the musl-cross-make project which target GCC 5.2.0.
> > 
> > 
> >   GMP 4.3.2
> >   ~~~~~~~~~
> >   The version of GMP used on the gcc infrastructure page[2], we
> >   would use the latest version, which is 6.1.0, except that we
> >   encounter errors when cross building the native compiler for
> >   the target. These exact errors are discussed on the gmp-bugs
> >   list in this thread[3].
> 
> You know, in email you _can_ just put in the link here, rather than
> footnotes and a bibliography...

Point taken, I'll take note of your preference of email style for the
mailing list :)

> >   For this older package, we required an update of config.sub
> >   and config.guess in order to recognize the -linux-musl* triples,
> >   this is introduced in the form of a patch in the sources/patches
> >   directory.
> 
> The FSF isn't dogfooding current versions of its own packages.
> Splendid.

It's not as bad as that - since the tarball we use for GMP is an older
one, it is packaged with an older config.sub & config.guess which do
not yet know about musl (the MPC & MPFR we use are bleeding edge, so
those tarballs are already distchecked with the newer
config.sub/config.guess).

> >   MPC 1.0.3
> >   ~~~~~~~~~
> >   Latest release tarball of MPC.
> > 
> > 
> >   MPFR 3.1.3
> >   ~~~~~~~~~~
> >   Latest release tarball of MPFR.
> > 
> > 
> >   Patches updating config.sub & config.guess
> >   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >   The config.sub & config.guess needed to be updated for make,
> >   bash and distcc in order to compile with the -linux-musl* host
> >   triples.
> 
> I didn't use the -linux-musl triplets, I had ccwrap override all of
> that.
> 
> For native builds, my build-one-package.sh script does:
> 
>   # Lobotomize config.guess so it won't complain about unknown
>   # target types.
>   # 99% of packages do not care, but autoconf throws a temper
>   # tantrum if the version of autoconf that created this back when
>   # the package shipped didn't know what a microblaze or hexagon
>   # was.  Repeat after me:
>   #   "Autoconf is useless"
> 
>   for guess in $(find . -name config.guess)
>   do
>     rm -f "$guess" &&
>     echo -e "#!/bin/sh\ngcc -dumpmachine" > "$guess" || exit 1
>   done

Started replying here, but there is more about host triples below so
replying there instead...

> >   This should not affect builds using the older toolchain using
> >   GCC 4.2.1, it merely updates these packages to recognize the
> >   new triple at build time.
> > 
> >   Note that config.sub & config.guess are under GPLv3 but include
> >   an exception in the license that:
> > 
> >     "you may include it under the same distribution
> >      terms that you use for the rest of that program"
> > 
> >   As an additional permission under section 7 of GPLv3.
> 
> So the license is... public domain?
> 
> If I _source_ it from one of those other packages, I get it under
> that other package's license...
> 
> If this is the FSF trying to be lawyers, I really don't want to use
> any of the code where they tried to be security experts.
> 
> > Host Tool Changes
> > ~~~~~~~~~~~~~~~~~
> > To build gcc 5.3, we now require:
> > 
> >   o System installed c++ compiler
> > 
> >     GCC now is partly written in C++
> > 
> >   o System installed ranlib
> > 
> >     Without this, we encounter problems building gcc,
> >     particularly when linking GCC libiberty.a in the final native
> >     compiler.
> 
> I'm tempted to make a v3build.sh that sets HOST_EXTRA=ranlib. I
> wonder how much of this I could do as a wrapper?

In general, that sounds like a better API than requiring one to specify
ENABLE_GPLV3=1 in the environment (or config), even if under the hood
this script ends up doing just that.

How much could be done in a wrapper script I don't know; this is
basically a tradeoff of code-sharing vs. separation. On one hand it's
desirable to keep the GPLv3 toolchain as separate as possible, on the
other hand it's an undesirable maintenance burden to duplicate all the
scripts in order to achieve greater separation.

The proposed patches introduce complete separation at the 'section'
level, so the build instructions are entirely separate for the binutils
and gcc sections. The main build scripts have a couple of added 'if'
statements here and there, because I judged that it was a lesser evil
than duplicating simple-cross-compiler.sh and native-compiler.sh.

> > Overview how we build GCC 5.3 compared to 4.2.1
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > 
> >   o Binutils takes the new flag --enable-install-libiberty, which
> >     was implied in older versions.
> > 
> >     This is currently insubstantial and we could do without
> >     installing libiberty here. We only install libiberty for the
> >     sake of building elf2flt for the sh2eb target, which is
> >     currently only supported with uClibc, which we do not build
> >     with GCC 5.3.
> > 
> >     This change was kept to ease transition to GCC 5.3 once we can
> >     support it, although it may be useless.
> 
> Aboriginal Linux is never "transitioning to" GCC 5.3. It's
> transitioning to LLVM.

Perhaps I misspoke, I do understand the long term plan is to transition
entirely to LLVM.

However, I'm not exactly sure how long that term might be, and it may
turn out to be necessary to build the sh2eb target with uClibc and
elf2flt using a modern toolchain... during this transition to LLVM.

But that said, I am personally not a fan of introducing code "in case
we might use it" and as such, I should probably have left out the
install of libiberty.

It's quite arbitrary and I added it to stay consistent with what the
older binutils install was doing, feel free to discard it.

> That said, I expect the jcore people are interested in this. :)
> 
> >   o We ignore NO_CPLUSPLUS when building the new toolchain, GCC 5.3
> >     requires C++ to build itself and even pass its own configure
> >     scripts, so there is no point to try building without C++.
> 
> Understood. LLVM has similar deficiencies, which is why I haven't
> entirely abandoned the idea of doing qcc someday (with a cfront
> variant to build llvm under it).
> 
> >   o The first stage compiler (or 'simple' cross compiler) is built
> >     in a different order:
> > 
> >     - Build GCC with only support for the C language
> >     - Build libc
> >     - Build GCC again, this time with C++ support
> 
> I went to great lengths to avoid that in mine. :)
> 
> >     This is because you need libc in order to build the libstdc++
> >     runtime. And you cannot get away with not building C++ at this
> >     stage of course, because you also need C++ to compile gcc in
> >     the full canadian cross compiler and native compiler.
> 
> Sad how gcc's deteriorated, isn't it?
> 
> >   o GCC's new dependencies, GMP, MPC and MPFR are built directly
> >     in the gcc build directory, this makes the whole build script
> >     a little simpler because we don't have to care about
> >     configuring and staging these libraries by hand as they are
> >     built as GCC modules.
> 
> No point in treating them as separate packages: they aren't really.
> 
> > Caveats
> > ~~~~~~~
> > There remain some dirty hacks and oddities, I will try to specify
> > them all here.
> > 
> >   o We install musl twice
> > 
> >     When building with GCC 5.3, we install musl twice because our
> >     ccwrap program expects musl in one location while the gcc build
> >     itself expects to find it in another location.
> > 
> >     This does not break anything but is redundant and dirty and
> >     is relatively easy to fix.
> 
> symlink?
> 
> >   o Redundant builds of gmp, mpfr and mpc
> > 
> >     I did not want to write individual build recipes and try to
> >     get all the configure flags right everywhere, and thought it
> >     prudent to allow GCC's toplevel configure script to configure
> >     those in its subdirectory as GCC should know how to do that
> >     better than us.
> > 
> >     The downside is that since we necessarily build GCC twice in
> >     the simple-cross-compiler stage, we end up building these
> >     libraries twice as well.
> > 
> >     I have tried issuing a make -C ${subdir} install in the first
> >     pass and reusing them in the second pass by passing --with-gmp
> >     etc during that second pass, and while this satisfies the
> >     configure script it also breaks the build for some reason.
> > 
> >     It could be the only sane fix is to build them completely
> >     separately.
> 
> I'm really not interested in optimizing the gplv3 build. If it works
> for you, great. In a year or two I want to throw out gcc entirely and
> switch to llvm, as soon as it supports a reasonable number of targets.
> 
> >   o The target triplets may have bugs right now.
> > 
> >     In order to build GCC 5.3 against musl for any arch, it is
> >     necessary to specify the target triplet ending in -musl*
> 
> Why? I never bothered with that...

I'm not sure why, a lot of the musl gcc patches (link again):

    https://github.com/GregorR/musl-gcc-patches

patch gcc to recognize the -musl in the triple and behave accordingly,
and many of these patches are already merged in upstream unreleased
gcc.

More below...

> >     The approach I've used to solve this is a little hack in
> >     functions.sh which specifies -linux-gnu as the default and
> >     substitutes 's/gnu/musl/' in the specific case that we are
> >     building GCC 5.3 against musl.
> 
> Why should gcc care what its libc is? It's a library implementing
> an api.
> 
> >     The thinking is that in the targets, if a specific
> >     CROSS_TARGET is specified, it should absolutely specify the
> >     trailing '-gnu' and let the build scripts decide if it is
> >     indeed -gnu or -musl.
> 
> In arm eabi I had to specify -gnueabi because it didn't work
> otherwise, and I never got around to patching it the rest of the way
> out. There's no other mention of "gnu" in any of the tuples.
> 
> >     A cleaner fix might be to mandate that if the targets specify a
> >     CROSS_TARGET, it use a special suffix, so for instance in
> >     sources/targets/armv5l we could specify:
> > 
> >              armv5l-unknown-linux-LIBC-SUFFIXeabi
> 
> Way too fancy, it will break.
> 
> >     And allow functions.sh to substitute LIBC-SUFFIX depending on
> >     which libc happens to be chosen for the given build.
> > 
> >     There could be various approaches to address this in a cleaner
> >     way.
> 
> Why does gcc need to know which libc it's building against? The point
> of ccwrap is to override its attempts to find the headers and dynamic
> linker and such, it shouldn't _care_.
> 
> If it does care, my "fix" would be surgery to force it to use names
> like "libc.so" and provide symlinks to them during the build.

I won't make any argument that gcc is correct in its desire to know
what libc implementation it's compiling against, it is indeed
disturbing, especially since I intend to use this gcc to compile
programs against another libc which this gcc was not compiled against.
It looks like an accident waiting to happen.

That said, as much as I don't agree with what gcc is doing, it is the
stock recommended way to build against musl, as I was informed by the
kind folks in #musl irc. So, my approach was to just try to play by the
rules even if I don't agree with them :)

If you have the knowledge of gcc required to safely patch gcc so that
it doesn't care about the libc flavor in the triple, it would certainly
be reassuring to me.

It would be even more reassuring if we could convince upstream gcc to
drop this requirement, but I won't hold my breath :)
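
For illustration, the triple substitution discussed above could be
sketched roughly as follows. This is not the code from functions.sh or
from the patch set; the helper name libc_triple and its arguments are
made up for this example, and it assumes the default triple always
contains "-linux-gnu".

```shell
#!/bin/sh
# Hypothetical helper, not taken from functions.sh.
# Usage: libc_triple TRIPLE LIBC
# Prints TRIPLE with "-linux-gnu" rewritten to "-linux-musl" when LIBC is
# musl, keeping any ABI suffix such as "eabi" intact.
libc_triple() {
  triple=$1
  libc=$2
  if [ "$libc" = "musl" ]; then
    printf '%s\n' "$triple" | sed 's/-linux-gnu/-linux-musl/'
  else
    # Any other libc keeps the default triple unchanged.
    printf '%s\n' "$triple"
  fi
}

libc_triple armv5l-unknown-linux-gnueabi musl   # armv5l-unknown-linux-musleabi
libc_triple x86_64-unknown-linux-gnu uclibc     # x86_64-unknown-linux-gnu
```

Anchoring the match on "-linux-gnu" rather than a bare "gnu" is a choice
made for the sketch, so that a "gnu" appearing elsewhere in a triple is
left alone.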


> > [0]:https://github.com/richfelker/musl-cross-make
> > [1]:https://github.com/GregorR/musl-gcc-patches
> > [2]:ftp://gcc.gnu.org/pub/gcc/infrastructure/
> > [3]:https://gmplib.org/list-archives/gmp-bugs/2015-December/003848.html
> 
> Devoid of context I dunno why you linked to any of that.
> 
> Ok, looking at your patches...
> 
> First patch modifies download.sh. The way ./download.sh works is that
> when you source download_functions.sh START_TIME=`date +%s` snapshots
> the current time, and then each "download" function updates the
> datestamp on existing tarballs that check out, and at the end
> cleanup_oldfiles deletes everything older than START_TIME.
> 
> Your if/else test that avoids calling download for certain tarballs
> will therefore delete the tarballs for whichever version it's not
> currently downloading, forcing them to be re-downloaded later.
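
The timestamp mechanism described here can be reproduced in a small
standalone sketch. The function names download and cleanup_oldfiles and
the START_TIME variable come from the description above, but the bodies
below are stand-ins written for this example (nothing is fetched), the
tarball names are placeholders, and it assumes a stat that accepts
-c %Y (GNU coreutils or busybox).

```shell
#!/bin/sh
# Standalone sketch, not the real download_functions.sh.
SRCDIR=$(mktemp -d)
START_TIME=$(date +%s)

# Stand-in for download(): no fetching, it only refreshes the timestamp
# of a tarball that is assumed to have checked out fine.
download() {
  touch "$SRCDIR/$1"
}

# Delete every file whose modification time predates START_TIME.
cleanup_oldfiles() {
  for file in "$SRCDIR"/*; do
    [ -e "$file" ] || continue
    if [ "$(stat -c %Y "$file")" -lt "$START_TIME" ]; then
      rm -f "$file"
    fi
  done
}

# Two tarballs left over from an earlier run (placeholder names).
touch -t 201501010000 "$SRCDIR/old-toolchain.tar" "$SRCDIR/new-toolchain.tar"

# An if/else that only calls download for one of them...
download new-toolchain.tar
cleanup_oldfiles

# ...leaves only new-toolchain.tar behind.
ls "$SRCDIR"
```

The tarball that was skipped never gets its timestamp refreshed, so the
cleanup pass treats it as stale.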
> 
> (That's as far as I made it looking into this the first time. I have
> infrastructure in the build control images to have different download
> subdirectories for subprojects, I need to make sure I've got that
> working in the top level one and then have a packages/biohazard
> subdirectory to download the gplv3 stuff into.)
> 
> This _also_ probably works best as a wrapper around download.sh
> downloading extra packages... :)

Maybe, this approach has some issues, but my patches to
download_functions.sh bring us part of the way there already.

The issues I'm referring to revolve around the build_section() and
setupfor() functions in functions.sh, whose semantics are to set up a
package by its _base_ name without specifying any version.

I modified the setupfor() function and some of the functions it calls
to include a 'variant' argument - this was to allow us to use a
separate patchset for a specific 'variant' of a given package.

Currently, you can either:

   build_section binutils

Or you can choose a variant of binutils:

   build_section binutils gplv3

The latter will result in patches from the 'patches-gplv3' directory
being applied instead of patches from the 'patches' directory.
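
As a minimal sketch of that selection (not the actual change to
setupfor(); the helper name patch_dir_for is invented for this example):

```shell
#!/bin/sh
# Hypothetical helper: map an optional variant name to a patch directory.
# Usage: patch_dir_for [VARIANT]
patch_dir_for() {
  variant=${1:-}
  if [ -n "$variant" ]; then
    echo "patches-$variant"
  else
    echo "patches"
  fi
}

patch_dir_for          # patches
patch_dir_for gplv3    # patches-gplv3
```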

This of course still requires that there is _only one_ version of
binutils downloaded in the 'packages' directory... if you want to allow
multiple versions of binutils to be simultaneously present, then a bit
more work needs to be done in download_functions.sh.

Perhaps one could add the same optional 'variant' argument to the
download() function so that it downloads to a separate:

    packages-${variant}

directory, and then during the build, the existing invocations to:

    build_section binutils gplv3

would not only fetch patches from the gplv3 variant specific patch
directory, but also fetch the tarball from the variant specific
packages directory.
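
A rough sketch of what such a lookup might look like, purely as an
illustration of the idea: the function find_tarball, its arguments and
the placeholder tarball names are all assumptions made here, and none of
this exists in download_functions.sh today.

```shell
#!/bin/sh
# Hypothetical lookup, not existing code.
# Usage: find_tarball TOPDIR NAME [VARIANT]
# Prints the first NAME-*.tar* found in TOPDIR/packages, or in
# TOPDIR/packages-VARIANT when a variant is given.
find_tarball() {
  dir="$1/packages${3:+-$3}"
  for tarball in "$dir/$2"-*.tar*; do
    if [ -e "$tarball" ]; then
      echo "$tarball"
      return 0
    fi
  done
  return 1
}

# Demo tree with one binutils tarball per directory (placeholder names).
TOP=$(mktemp -d)
mkdir -p "$TOP/packages" "$TOP/packages-gplv3"
touch "$TOP/packages/binutils-OLD.tar.bz2"
touch "$TOP/packages-gplv3/binutils-NEW.tar.bz2"

find_tarball "$TOP" binutils          # .../packages/binutils-OLD.tar.bz2
find_tarball "$TOP" binutils gplv3    # .../packages-gplv3/binutils-NEW.tar.bz2
```

With one directory per variant, two versions of the same package can sit
side by side without the base-name lookup becoming ambiguous.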

> Alright, I've transferred your patch list onto my netbook, I'll see
> what I can do...

Thank you for looking into this, I appreciate your taking time to fix
issues on your side, do let me know if you would like me to make
changes to the patches in the existing pull request on my side.

Best Regards,
    -Tristan



