[Toybox] scripts/llvm-buildall.sh?

Rob Landley rob at landley.net
Wed Aug 4 22:46:07 PDT 2021


On 8/4/21 6:10 PM, Patrick Oppenlander wrote:
> On Wed, Aug 4, 2021 at 8:53 PM Rob Landley <rob at landley.net> wrote:
>>
>> On 8/3/21 7:05 PM, Patrick Oppenlander wrote:
>> >> What I'd LIKE to do is create a scripts/llvm-buildall.sh that builds all the
>> >> supported musl+llvm targets the way mcm-buildall.sh does for musl+gcc. And
>> >> getting llvm-project itself to do that was pretty straightforward (it builds
>> >> them all by default). But clang-rt depends on the target libc headers being
>> >> there first (...why?) and then the invocation is... yeah.
>> >
>> > I mentioned on the musl list that I've had a go at this too.
>>
>> Dig dig dig...
>>
>>   https://www.openwall.com/lists/musl/2021/07/26/3
>>
>> Ooh, very nice. Thanks.
>>
>> > Getting the llvm project & musl to play nice together is a pain. In
>> > case you missed it, what I came up with is here:
>> >
>> > https://github.com/apexrtos/musl-cross-make/blob/clang/litecross/Makefile.clang
>>
>> Let's see...
>>
>> I made it to the giant span of "foreach" starting on line 49 and had to stop for
>> health reasons.
> 
> If you want the llvm build system to build multiple builtin or runtime
> targets there's a lot of configuration required :(

For the _compiler_, it's just:

mkdir -p build-llvm && cd build-llvm &&
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$PREFIX \
    -DLLVM_ENABLE_PROJECTS="clang;lld" "$TOP/llvm-project/llvm" &&
ninja all install &&
cd .. && rm -rf build-llvm || exit 1

With $PREFIX and $TOP set appropriately, of course. The resulting clang
--print-targets lists 39 options, from aarch64 to whatever "xcore" is.

There's no reason their version of libgcc should be noticeably harder to
configure, AND YET. (What do you mean by "runtime targets"?)

> There's a bunch of cmake "caches" in the llvm project which set a
> bunch of these variables for some canned configurations (e.g. see
> llvm-project/clang/cmake/caches/BaremetalARM.cmake).

Yeah, I rm -rf the build directory each time because otherwise it ignores what
you tell cmake's command line. (Not a fan.)

> But yes, I should probably use a single foreach there though. Not
> really sure why I did it like that.

I'm not a huge fan of trying to scale makefiles beyond trivial usage. Simple
things like "if this file exists, set this variable" turn out to be SURPRISINGLY
HARD TO DO. (And https://en.wikipedia.org/wiki/COMEFROM was originally a JOKE,
yet it's fundamentally how Make works. Mixing imperative and declarative flow
control in the same file...)

>> The next question is, of course, WHY compiler-rt depends on the libc headers.
>> And if it's GOING to depend on a header file:
>>
>> https://github.com/torvalds/linux/blob/master/tools/include/nolibc/nolibc.h
>>
>> Says it's MIT licensed right at the top...
> 
> Nice. They'd need something similar for 15 other OS targets though.

Do any of them other than Windows NOT fall under "Posix values complying with
LP64"? Even all the RTOSes these days are posix+LP64. Compiling for BARE METAL
is "whatever subset of posix we decided to provide for ourselves" plus LP64.

What specifically does compiler-rt need out of these headers? (It was cmake's
configure plumbing dying when I didn't install the headers before trying to
build compiler-rt, not any actual C file trying to #include a header...)

Let's see: grep -r 'include[ \t]*<' compiler-rt/lib/builtins

#include stdint.h is pilot error (Even WINDOWS standardizes char/short/int size
and __PTRDIFF_TYPE__ is another of those builtin #defines in the compiler, yes
that includes clang.)

compiler-rt/lib/builtins/int_endianness.h doesn't need to #include anything for
the same reason:

  clang -dM -E - < /dev/null | grep ENDIAN
  #define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__
  #define __LITTLE_ENDIAN__ 1
  #define __ORDER_BIG_ENDIAN__ 4321
  #define __ORDER_LITTLE_ENDIAN__ 1234
  #define __ORDER_PDP_ENDIAN__ 3412

unwind-ehabi-helpers.h #includes unwind.h but that isn't in libc, it's one of the
/usr/lib/gcc/*/*/include headers.

Why does compiler-rt/lib/builtins/os_version_check.c exist?

...eprintf? Really?

// __eprintf() was used in an old version of <assert.h>.
// It can eventually go away, but it is needed when linking
// .o files built with the old <assert.h>.

clear_cache.c?

// The compiler generates calls to __clear_cache() when creating
// trampoline functions on the stack for use with nested functions.
// It is expected to invalidate the instruction cache for the
// specified range.

I'm going to stop looking at this now. There should probably be a minimal libgcc
replacement project that ISN'T THIS, but gcc also had libgcc and libgcc_eh split
into two libraries for a reason. (Not necessarily a GOOD reason given it was the
FSF doing it, but still...)

> Still, depending on the C library headers seems completely broken.
> 
>> By the way, I tried to subscribe to the llvm-dev mailing list but
>> https://lists.llvm.org/mailman/listinfo/llvm-dev is disabled and emailing that
>> subscribe@ thing multiple times has not gotten a response. That development
>> community is approximately as welcoming to outsiders as xfree86 was.
> 
> You're telling me.

The highly oversimplified backstory is when gcc went GPLv3, Apple stayed on the
last GPLv2 release in xcode for 5 years and hired a grad student to extend his
thesis into a replacement, and threw engineers at the problem. Then it got
genericized about as well as Darwin or WebKit did. (Apple corporate knew that
proprietary technology gets less trust/adoption, but is biologically incapable
of NOT working that way, so they semi-regularly release abandonware long after
it was developed in secret).

Also, you know how Mozilla sucked at open source because all the Netscape
engineers worked in the same building and their preferred communication channel
was "knock on fellow engineer's door" rather than any sort of mailing list
accessible to anybody else? (Jamie Zawinski published an autopsy on that when he
quit...) Apple's got that issue too...

The primary thing LLVM has going for it is not being GPL. The FSF dared the
world to replace gcc, and they did, but that was a push OFF gcc. The draw _into_
llvm was Apple throwing money at the problem, with Google and the Linux
Foundation adding more money to purge themselves of GPLv3 gcc.

> I've submitted a bunch of patches to phabricator which are in various
> states of landed (their word for merged), accepted (waiting for..
> something?), and bikeshedding.

Never heard of it. Is that how you get stuff in there?

> I've given up beating my head against it for now and am currently
> rebasing this list whenever they release something:
> 
> https://github.com/apexrtos/musl-cross-make/tree/clang/patches/llvm-project-12.0.1.src

Lovely.

>> > * build musl
>> > * install kernel headers
>> > * do an llvm "runtimes" build
>> >
>> > I also create symlinks & config files so that it looks like a normal
>> > gcc style toolchain.
>>
>> The kernel needs to learn that $CROSS-cc is a good compiler name, and $CROSS-gcc
>> is not. And:
>>
>>   https://github.com/torvalds/linux/blob/v5.13/Makefile#L434
>>
>> Is just painful. Unfortunately, I'm not sure how you'd say "use the first of
>> ${CROSS_COMPILE}{clang,gcc,cc} that exists" in Makefile syntax. The foreach
>> stuff expands ALL of them, doesn't seem to have a "break" construct. (And of
>> course musl does it in shell script, which is the sane way to do it. :)
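For comparison, the "first of ${CROSS_COMPILE}{clang,gcc,cc} that exists" test
is short in shell. This is only a sketch: first_compiler is a made-up helper
name, and it assumes the candidates are looked up in $PATH.

```shell
# Hypothetical helper: print the first of ${prefix}clang, ${prefix}gcc,
# ${prefix}cc that exists in $PATH. $1 is the prefix, e.g. "aarch64-linux-musl-".
first_compiler() {
  for suffix in clang gcc cc; do
    if command -v "$1$suffix" >/dev/null 2>&1; then
      printf '%s\n' "$1$suffix"
      return 0   # stop at the first hit, the "break" that foreach lacks
    fi
  done
  return 1       # none of the candidates exist
}
```

Usage would be something like CC="$(first_compiler "$CROSS_COMPILE")" || exit 1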
> 
> It's even worse because llvm doesn't install symlinked binaries, so to
> use clang you need to "clang --target=arm-xxx" ala

Wrapper script. (Although a sane clang would do the multiplexer trick of parsing
its argv[0] prefix up to the last - as a --target argument...)
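A rough sketch of the argv[0] trick as a wrapper script, in case it helps.
target_from_name is a hypothetical helper, and the assumption is that the
wrapper gets installed (or symlinked) under names like aarch64-linux-musl-cc:

```shell
# Hypothetical helper: derive a --target value from the name the wrapper
# was invoked as, e.g. "/usr/bin/aarch64-linux-musl-cc" -> "aarch64-linux-musl".
target_from_name() {
  name="${1##*/}"                        # strip any directory part
  case "$name" in
    *-*) printf '%s\n' "${name%-*}" ;;   # everything before the last '-'
    *)   return 1 ;;                     # no prefix, so no target to derive
  esac
}

# The body of the wrapper itself would then be roughly:
#   target="$(target_from_name "$0")" && exec clang --target="$target" "$@"
#   exec clang "$@"
```

This only handles the prefix; it does not check that clang actually knows the
resulting target.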

> https://github.com/torvalds/linux/blob/62fb9874f5da54fdb243003b386128037319b219/Makefile#L578
> 
> I work around that in my build by constructing symlinks and .cfg files
> to make clang look like a traditional triple prefixed toolchain.

What IS the deal with the .cfg file? Does it actually have to be in the $PATH?
(And/or same directory as the compiler binary?)

>> > It would be nice if there was one well maintained place to go for an
>> > llvm/musl build recipe.
>>
>> I dunno about "well maintained" but if I can get it to work I'm happy to provide
>> a script, and possibly even host binaries although a gigabyte tarball is a bit
>> of a stretch for my hosting...
> 
> I think my mcm fork should do what you need, but I know you prefer
> more minimal solutions to these problems.

I was happy to leave gcc building to Rich because I'm not getting GPLv3 on me in
a hobbyist context, but a wrapper makefile to build multiple packages? Recursive
make defeats the purpose of make:

https://web.archive.org/web/20030916203331/http://aegis.sourceforge.net/auug97.pdf

So either somebody else does it perfectly all by themselves, or I don't use it.
Because I'm not very interested in contributing patches to such a system. (And
yes that's why I'm not a buildroot developer.)

I've already got a 70 line shell script that does all but one package of this. That
one package is compiler-rt, and every time I dig into it I wind up scoping what
it would take to REPLACE...

Rob


