[Aboriginal] Proposed patches to support modern toolchain

Rob Landley rob at landley.net
Thu Feb 11 11:14:29 PST 2016

On 02/08/2016 11:42 PM, Tristan Van Berkom wrote:
> Hi,
> Sorry for the repeat of this message, it is important to my employer
> (and to me as well) to have this message linkable in the mailing list
> archives and I'm just reposting this message (with a couple of minor
> edits) for posterity. To clarify, the large part of 2015's lost
> archives have been restored (yay!) and in restoring the backup, this
> message and a few others were lost.
> [Original message]
> Hello Aboriginal Mailing List !
>     As some who have been present on irc are aware, I have been working
> on a patch set to allow aboriginal linux to build with a more
> recent (GPLv3) toolchain.

I note that I will never distribute prebuilt GPLv3 binaries, so either
we retain support for the GPLv2 binaries or we stop distributing
prebuilt binaries entirely.

This is why, long-term, I've wanted to switch to an LLVM based
toolchain. http://ellcc.org is doing that with musl, AOSP builds one as
part of its build (it doesn't build android with the host toolchain, it
builds everything with llvm), and there are instructions on how to build
it in the current "Beyond Linux From Scratch".

I just haven't gotten around to it yet...

> The initial patch set is working now, I have built and tested the
> compiler and distcc on the resulting images for invocations with
> CROSS_COMPILER_HOST=i686 for the armv5l and mips images.

It's a pity the FSF changed licenses. Apple used to use gcc in xcode
until the license change, then froze on the last GPLv2 release for 5
years while they sponsored work on a replacement project.

I.E. the llvm project exists in its current form because I'm not the
ONLY person who was fine with GPLv2 but won't get GPLv3 on them, and is
not going to "get over it" any more than I'd "get over it" about using

> This passes nicely and the resulting image works with your regular 
> dev-environment.sh invocation, the compiler builds hello world in C and
> C++ and also successfully distributes compilation over distcc to the
> cross compiler.
> In this mail I will outline the proposed changes implemented by this
> patch set, along with any caveats and/or observations I can think of at
> the time of writing it. The patch set however documents itself also.
> The patchset is available at the following github branch and a pull
> request was created for it today:
>     https://github.com/gtristan/aboriginal/tree/dual-toolchain
> Enjoy !
> Best Regards,
>     -Tristan
> Overview
> ~~~~~~~~
> This patchset does not replace the existing GPLv2 toolchain but instead
> adds the newer toolchain in such a way that you may choose which
> toolchain to build.

I'm adding toybox to replace busybox, at which point I can remove
busybox. I sometimes give status updates about this in the release notes.

I'm adding musl to replace uClibc, at which point I can remove uClibc.
The main reason for the delay has been lack of target support in musl; I
didn't start adding it until I could at least see a path to removing
uClibc. (We still need sparc and m68k and so on.)

I planned to add llvm/lld to replace gcc/binutils, at which point I
could remove gcc/binutils. The main reason for the delay has been the
lack of target support in llvm/lld: you can't build sh4 with llvm.

This adds a second set of infrastructure in parallel, which only some
people will test, only some of the time, in only certain configurations.

That said, people have real world needs and old toolchain is old, so my
concern for this transitional infrastructure (until llvm is ready) is
whether it can be made sufficiently unobtrusive.

> Currently, only musl flavored builds are available with the modern
> toolchain, it is possible to try and fix that so as to build uClibc
> with GCC 5.3 if that is desirable to some, but it's just currently not
> supported.
> By default, the build will continue to build the GPLv2 toolchain, in
> order to build the new toolchain you need only specify:
> In order to allow multiple builds of the same package (i.e. for gcc and
> binutils) I have made some changes to functions.sh and
> download_functions.sh.
> Changes to the payload
> ~~~~~~~~~~~~~~~~~~~~~~
> This patch set introduces the following packages and patches to the
> build.
>   Binutils 2.25.1
>   ~~~~~~~~~~~~~~~
>   Latest release tarball of binutils.
>   I have also included the musl related patches, imported from the
>   musl-cross-make project[0].
>   GCC 5.3.0
>   ~~~~~~~~~
>   Latest release tarball of GCC
>   I have also included Gregor Richards' patch set[1] to build
>   GCC 5.3 against musl. These are more up to date than the
>   patches in the musl-cross-make project which target GCC 5.2.0.
>   GMP 4.3.2
>   ~~~~~~~~~
>   The version of GMP used on the gcc infrastructure page[2], we
>   would use the latest version, which is 6.1.0, except that we
>   encounter errors when cross building the native compiler for
>   the target. These exact errors are discussed on the gmp-bugs
>   list in this thread[3].

You know, in email you _can_ just put in the link here, rather than
footnotes and a bibliography...

>   For this older package, we required an update of config.sub
>   and config.guess in order to recognize the -linux-musl* triples,
>   this is introduced in the form of a patch in the sources/patches
>   directory.

The FSF isn't dogfooding current versions of its own packages. Splendid.

>   MPC 1.0.3
>   ~~~~~~~~~
>   Latest release tarball of MPC.
>   MPFR 3.1.3
>   ~~~~~~~~~~
>   Latest release tarball of MPFR.
>   Patches updating config.sub & config.guess
>   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   The config.sub & config.guess needed to be updated for make, bash and
>   distcc in order to compile with the -linux-musl* host triples.

I didn't use the -linux-musl triplets, I had ccwrap override all of that.

For native builds, my build-one-package.sh script does:

  # Lobotomize config.guess so it won't complain about unknown target types.
  # 99% of packages do not care, but autoconf throws a temper tantrum if
  # the version of autoconf that created this back when the package shipped
  # didn't know what a microblaze or hexagon was.  Repeat after me:
  #   "Autoconf is useless"

  for guess in $(find . -name config.guess)
  do
    rm -f "$guess" &&
    echo -e "#!/bin/sh\ngcc -dumpmachine" > "$guess" || exit 1
  done

>   This should not affect builds using the older toolchain using GCC
>   4.2.1, it merely updates these packages to recognize the new triple
>   at build time.
>   Note that config.sub & config.guess are under GPLv3 but include an
>   exception in the license that:
>     "you may include it under the same distribution
>      terms that you use for the rest of that program"
>   As an additional permission under section 7 of GPLv3.

So the license is... public domain?

If I _source_ it from one of those other packages, I get it under that
other package's license...

If this is the FSF trying to be lawyers, I really don't want to use any
of the code where they tried to be security experts.

> Host Tool Changes
> ~~~~~~~~~~~~~~~~~
> To build gcc 5.3, we now require:
>   o System installed c++ compiler
>     GCC now is partly written in C++
>   o System installed ranlib
>     Without this, we encounter problems building gcc, particularly when
>     linking GCC libiberty.a in the final native compiler.

I'm tempted to make a v3build.sh that sets HOST_EXTRA=ranlib. I wonder
how much of this I could do as a wrapper?
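
A rough sketch of what such a wrapper might look like. The name v3build
and the BUILD_CMD override are hypothetical (BUILD_CMD only exists so the
sketch can be tried without a full source tree), and it assumes the
top-level entry point is ./build.sh and that HOST_EXTRA is a
space-separated list of extra host binaries:

```shell
#!/bin/bash
# Hypothetical v3build wrapper: add ranlib to HOST_EXTRA, keep anything the
# caller already put there, then hand all arguments to the normal build.
# BUILD_CMD is an illustrative knob for testing, not an existing variable.
v3build()
{
  HOST_EXTRA="ranlib${HOST_EXTRA:+ $HOST_EXTRA}" \
    "${BUILD_CMD:-./build.sh}" "$@"
}
```

Calling "v3build armv5l" would then run ./build.sh armv5l with ranlib
added to the host tool list. Anything beyond environment tweaks (extra
packages, a different build order) would not fit in a wrapper this thin.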

> Overview how we build GCC 5.3 compared to 4.2.1
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   o Binutils takes the new flag --enable-install-libiberty, which was
>     implied in older versions.
>     This is currently insubstantial and we could do without installing
>     libiberty here. We only install libiberty for the sake of building
>     elf2flt for the sh2eb target, which is currently only supported
>     with uClibc, which we do not build with GCC 5.3.
>     This change was kept to ease transition to GCC 5.3 once we can
>     support it, although it may be useless.

Aboriginal Linux is never "transitioning to" GCC 5.3. It's transitioning
to LLVM.

That said, I expect the jcore people are interested in this. :)

>   o We ignore NO_CPLUSPLUS when building the new toolchain, GCC 5.3
>     requires C++ to build itself and even pass its own configure
>     scripts, so there is no point to try building without C++.

Understood. LLVM has similar deficiencies, which is why I haven't
entirely abandoned the idea of doing qcc someday (with a cfront variant
to build llvm under it).

>   o The first stage compiler (or 'simple' cross compiler) is built
>     in a different order:
>     - Build GCC with only support for the C language
>     - Build libc
>     - Build GCC again, this time with C++ support

I went to great lengths to avoid that in mine. :)

>     This is because you need libc in order to build the libstdc++
>     runtime. And you cannot get away with not building C++ at this
>     stage of course, because you also need C++ to compile gcc in the
>     full canadian cross compiler and native compiler.

Sad how gcc's deteriorated, isn't it?

>   o GCC's new dependencies, GMP, MPC and MPFR are built directly in the
>     gcc build directory, this makes the whole build script a little
>     simpler because we don't have to care about configuring and staging
>     these libraries by hand as they are built as GCC modules.

No point in treating them as separate packages: they aren't really.

> Caveats
> ~~~~~~~
> There remain some dirty hacks and oddities, I will try to specify them
> all here.
>   o We install musl twice
>     When building with GCC 5.3, we install musl twice because our
>     ccwrap program expects musl in one location while the gcc build
>     itself expects to find it in another location.
>     This does not break anything but is redundant and dirty and
>     is relatively easy to fix.


>   o Redundant builds of gmp, mpfr and mpc
>     I did not want to write individual build recipes and try to get
>     all the configure flags right everywhere, and thought it prudent to
>     allow GCC's toplevel configure script to configure those in its
>     subdirectory as GCC should know how to do that better than us.
>     The downside is that since we necessarily build GCC twice in the
>     simple-cross-compiler stage, we end up building these libraries
>     twice as well.
>     I have tried issuing a make -C ${subdir} install in the first pass
>     and reusing them in the second pass by passing --with-gmp etc
>     during that second pass, and while this satisfies the configure
>     script it also breaks the build for some reason.
>     It could be the only sane fix is to build them completely
>     separately.

I'm really not interested in optimizing the gplv3 build. If it works for
you, great. In a year or two I want to throw out gcc entirely and switch
to llvm, as soon as it supports a reasonable number of targets.

>   o The target triplets may have bugs right now.
>     In order to build GCC 5.3 against musl for any arch, it is
>     necessary to specify the target triplet ending in -musl*

Why? I never bothered with that...

>     The approach I've used to solve this is a little hack in
>     functions.sh which specifies -linux-gnu as the default and
>     substitutes 's/gnu/musl' in the specific case that we are building
>     GCC 5.3 against musl.

Why should gcc care what its libc is? It's a library implementing an API.

>     The thinking is that in the targets, if a specific CROSS_TARGET is
>     specified, it should absolutely specify the trailing '-gnu' and let
>     the build scripts decide if it is indeed -gnu or -musl.

In arm eabi I had to specify -gnueabi because it didn't work otherwise,
and I never got around to patching it the rest of the way out. There's
no other mention of "gnu" in any of the tuples.
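
For reference, the gnu-to-musl substitution described in the quoted text
could be sketched roughly like this. The function name libc_triplet and
its arguments are made up for illustration; this is not the code from the
patch set:

```shell
#!/bin/bash
# Hypothetical helper: $1 is a target tuple written with -gnu, $2 is the
# libc being built against. Only the musl case rewrites the tuple.
libc_triplet()
{
  if [ "$2" = musl ]
  then
    echo "$1" | sed 's/-gnu/-musl/'
  else
    echo "$1"
  fi
}

libc_triplet armv5l-unknown-linux-gnueabi musl    # armv5l-unknown-linux-musleabi
libc_triplet i686-unknown-linux-gnu uClibc        # i686-unknown-linux-gnu
```

Anchoring the match on "-gnu" rather than a bare "gnu" keeps the
substitution from touching any other part of the tuple.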

>     A cleaner fix might be to mandate that if the targets specify a
>     CROSS_TARGET, it use a special suffix, so for instance in
>     sources/targets/armv5l we could specify:
>              armv5l-unknown-linux-LIBC-SUFFIXeabi

Way too fancy, it will break.

>     And allow functions.sh to substitute LIBC-SUFFIX depending on which
>     libc happens to be chosen for the given build.
>     There could be various approaches to address this in a cleaner way.

Why does gcc need to know which libc it's building against? The point of
ccwrap is to override its attempts to find the headers and dynamic
linker and such, it shouldn't _care_.

If it does care, my "fix" would be surgery to force it to use names like
"libc.so" and provide symlinks to them during the build.

> [0]:https://github.com/richfelker/musl-cross-make
> [1]:https://github.com/GregorR/musl-gcc-patches
> [2]:ftp://gcc.gnu.org/pub/gcc/infrastructure/
> [3]:https://gmplib.org/list-archives/gmp-bugs/2015-December/003848.html

Devoid of context I dunno why you linked to any of that.

Ok, looking at your patches...

First patch modifies download.sh. The way ./download.sh works is that
when you source download_functions.sh, START_TIME=`date +%s` snapshots
the current time, and then each "download" function updates the
datestamp on existing tarballs that check out, and at the end
cleanup_oldfiles deletes everything older than START_TIME.

Your if/else test that avoids calling download for certain tarballs will
therefore delete the tarballs for whichever version it's not currently
downloading, forcing them to be re-downloaded later.
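
A self-contained sketch of that failure mode, using a scratch directory
and stand-in functions. START_TIME and cleanup_oldfiles are the names
used above; the function bodies, the tarball names, and the use of
"stat -c %Y" (GNU coreutils, busybox and toybox all have it) are
assumptions, not the real download_functions.sh:

```shell
#!/bin/bash
SRCDIR=$(mktemp -d)

# Two tarballs left over from an earlier run, with old datestamps.
touch -t 201501010000 "$SRCDIR/gcc-old.tar.bz2" "$SRCDIR/gcc-new.tar.bz2"

# Snapshot the current time, as happens when the functions are sourced.
START_TIME=$(date +%s)

# Stand-in for "download": an existing tarball that checks out just gets
# its datestamp updated.
download()
{
  touch "$SRCDIR/$1"
}

# Stand-in for cleanup_oldfiles: delete everything older than START_TIME.
cleanup_oldfiles()
{
  for i in "$SRCDIR"/*
  do
    if [ "$(stat -c %Y "$i")" -lt "$START_TIME" ]
    then
      rm -f "$i"
    fi
  done
}

# An if/else that only calls download for the selected toolchain...
download gcc-new.tar.bz2
cleanup_oldfiles

# ...leaves the other toolchain's tarball to be deleted.
ls "$SRCDIR"
```

Only gcc-new.tar.bz2 survives, so switching back to the other toolchain
means fetching its tarball again.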

(That's as far as I made it looking into this the first time. I have
infrastructure in the build control images to have different download
subdirectories for subprojects, I need to make sure I've got that
working in the top level one and then have a packages/biohazard
subdirectory to download the gplv3 stuff into.)

This _also_ probably works best as a wrapper around download.sh
downloading extra packages... :)

Alright, I've transferred your patch list onto my netbook, I'll see what
I can do...

