[Toybox] GitHub Action Example
rob at landley.net
Mon Jun 15 16:12:37 PDT 2020
On 6/15/20 1:42 PM, enh wrote:
> On Mon, Jun 15, 2020 at 11:05 AM Rob Landley <rob at landley.net> wrote:
>>> i don't actually remember us ever having an aarch64-specific issue.
>>> (funnily enough, a 32-bit x86 build would probably find more bugs,
>>> since i don't think anyone regularly tests any 32-bit arch locally. i
>>> certainly only find 32-bit issues when i try to run on an Android
>> I test it before releases, and I test the j-core stuff which is still only 32
>> bit. But it's not tested nearly as regularly as the 64 bit stuff is.
> (to be clear, i meant "at the time of commit". thanks to the 32-bit
> x86 "cuttlefish" emulator testing, we do get testing every time i try
> to sync AOSP. but .)
This wouldn't be at time of commit either, this would be at time of push to
github. Which lately has been a day or so after my local commit on my laptop
because I forget. :P
That said, you've reminded me I should poke at getting an aosp build going in an
emulator again, so I googled for "build and run aosp under qemu" and the second
hit was an actual android link rather than third party weirdness from adware sites:
And that says:
Which hey, seems feasible. A repo sync in the aosp I already downloaded took 22
minutes WITH google fiber (and that was just updating the previous download
which I last synced earlier this month). But at least the repo version that
still runs under python 2 hasn't stopped working yet.
Did you know that "lunch" without options does _not_ list sdk-eng? (Which sounds
like it's building the sdk and not an aosp image to run under the emulator, but
let's at least try what it says first...)
16:57:50 Build sandboxing disabled due to nsjail error.
I did a make without -j and it's launching 6 parallel processes on a 4 processor
laptop. (I should have taskset it, now I know.)
It gave me a package progress indicator disturbingly like yocto's (lemme guess:
they're copying aosp), and it made it to 185 of 186 pretty quickly, and then 185
went for a minute and a half, and it's now on 187 of 396 and ONE of us can't count.
Sigh. I keep googling for "intro to AOSP" and either getting youtube videos from
2013. The first hit that's actually on android.com rather than meetup.com or
which is an explanation of governance philosophy and the "android compatibility
program". (At least the sdk README told me what to try running.)
[ 5% 4111/79676] //external/flatbuffers:flatc clang++ src/idl_gen_php.cpp [linu
Lovely. This thing really really REALLY cannot count. And at this point it's
gonna be eating all 4 processors through dinner.
(I'd kill it and restart it with taskset, but I'm not sure how to "make clean"
and I am that guy who hits every weird dependency bug from incomplete partial
builds pretty much every time...)
So that's happening. Moving on...
>> I'm trying to get the mkroot plumbing to run the test suite under qemu. I'm
>> about 3/4 of the way there. That should get more variants tested in a more
>> easily automatable fashion, but it's all musl (and glibc) unless bionic stops
>> segfaulting "hello world" when run in a chroot that doesn't have /dev/null in it
>> But right now the test suite hasn't got nearly as much coverage as I'd like, and
>> until I fix that running it regularly doesn't really _prove_ anything. (And
>> finishing the shell is eating my cycles...)
> in the Dijkstra sense (page 16 of
> http://homepages.cs.ncl.ac.uk/brian.randell/NATO/nato1969.PDF), sure.
In the "I add tests for the thing I just noticed was broken all the time" sense too.
> but we definitely have introduced test failures. and as we've seen
> from Android, just running them a lot helps shake out issues too ---
> mostly in the tests, but also in the toys. (and anyone running tests
> downstream is going to hit test flake, so that still needs fixing.)
Making the tests properly deterministic can be a challenge at times. As The
Moment said, "I do my best."
> and note also that this checks you can _build_ both glibc and musl
> (which has been problematic in the past), both gcc and clang (which
> has been problematic in the past), and that there are no ASan issues
> that will prevent running the tests/toys on a HWASan/MTE aarch64
Good point. I _have_ tests for all those but they take long enough to run I tend
to only really regression test that as part of my release process unless I
stumble across stuff before then.
The qemu stuff is intended to let me automate it so I can run it more easily and
often, but it doesn't help with the MacOS stuff because Apple went out of their
way to stop MacOS from running under qemu because proprietary and tied to a
hardware dongle in a keyboard controller.
>> And 32 bit argument processing has a known structural limitation (the "truncate
>> -s 8G" thing) which I've mentioned here before. I know how to fix it, but the
>> fix is intrusive enough I'm not sure it's worth doing?
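The arithmetic behind that limitation can be sketched as follows. This is only an illustration, and it assumes the option value ends up in a C long, which is 32 bits wide on typical 32 bit Linux targets, so the largest size that fits is 2^31-1 bytes. The fits_in_long32 helper is hypothetical, not something from toybox, and the sketch needs a shell with 64 bit arithmetic (bash, or dash on a 64 bit host).

```shell
#!/bin/sh
# Largest value of a signed 32-bit long (the assumed width on 32-bit targets).
LONG32_MAX=2147483647

# Hypothetical helper: succeed if a byte count fits in a signed 32-bit long.
fits_in_long32() {
  [ "$1" -le "$LONG32_MAX" ]
}

# "8G" means 8 * 2^30 bytes, which is 2^33.
EIGHT_G=$((8 * 1024 * 1024 * 1024))

if fits_in_long32 "$EIGHT_G"; then
  echo "8G fits in a 32-bit long"
else
  echo "8G ($EIGHT_G bytes) overflows a 32-bit long"
fi
```

Under that assumption anything of 2G or more is out of range, so the 8G case is just one instance of the problem.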
> (i'm much more interested in getting to where we have 64-bit-only,
> both to replace the current 64/32 high end and the 32-only low end.)
I thought Android already mostly gave up 32 bit support (all those old phones
and tablets I can't upgrade past Marshmallow), but the embedded space ain't gonna.
Embedded still deploys plenty of _new_ 16 bit and 8 bit systems each year. Heck,
the AVR processor in the arduino is 8-bit. (ARM is spending a lot of money to
try to convert as much of that to Cortex-M as they can, but that's just this
generation's "all the world's a vax".)
>>>>> it seems like your setup is running on a cron-like timer? is there a way to say "on every push" instead?
>>>> There are three build triggers in the configuration, cron, push on master and pull request on master. As I forked the repo you only see changes to my repo (emolitor/toybox) trigger builds and not landley/toybox.
>>> ah, i see. hopefully rob will look at
>>> and turn this on for the main repo then :-)
>> I'm uncomfortable putting Microsoft Github dependencies directly into toybox,
>> especially now Microsoft seems to be back on its "embrace and extend" kick:
> it's not a dependency though. just a convenience. right now, we have
> humans doing this, and we can always go back to that if we have to.
> but if MS is going to give you free CPU cycles to save a little bit of
> human time...
That would be the "embrace" part, yes. First one's free, don't worry it's
harmless you won't get hooked.
>> Adding a .github subdirectory to the source would be a policy change. I'm happy
>> with a fork doing it, but am uncomfortable putting it in the main repo. (Not
>> fatally uncomfortable, but... ergh?)
> but if it's in a fork, we don't get the benefit. that's basically back
> to humans doing something that's a job for a computer...
I "git push" from the command line and don't look at the web gui for days
if not weeks at a time. What does the output of this look like? (Yet more email?)
>>>> --- a/.github/workflows/toybox.yml
>>>> +++ b/.github/workflows/toybox.yml
>>>> @@ -37,6 +37,20 @@ jobs:
>>>>      - name: Test
>>>>        run: make tests
>>>> +  Ubuntu-20_04-Clang-ASAN:
>>>> +    runs-on: ubuntu-20.04
>>>> +    steps:
>>>> +    - uses: actions/checkout@v2
>>>> +    - name: Setup
>>>> +      run: sudo apt-get install build-essential clang
>>>> +    - name: Configure
>>>> +      run: make defconfig
>>>> +    - name: Build
>>>> +      run: CC=clang ASAN=1 make
>>>> +    - name: Test
>>>> +      run: make tests
>> Given that you aren't setting VERBOSE=fail I assume you want me to add a patch like:
>> diff --git a/scripts/test.sh b/scripts/test.sh
>> index 20f76d09..cdfe3bdb 100755
>> --- a/scripts/test.sh
>> +++ b/scripts/test.sh
>> @@ -60,3 +60,5 @@ else
>> do_test "$i"
>> +[ $FAILCOUNT -eq 0 ]
>> So make can _tell_ it failed?
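To illustrate why that trailing test matters: a shell script (or function) exits with the status of its last command, so ending on a check of the failure count is what turns "some tests failed" into a nonzero exit code that make or a CI runner can act on. A minimal sketch, assuming a FAILCOUNT variable that is bumped once per failing test as in the patch above. The run_suite function and the commands passed to it are hypothetical stand-ins, not the real scripts/test.sh.

```shell
#!/bin/sh
# Hypothetical stand-in for a test runner: each argument is a command
# to run as one "test". This is not the real scripts/test.sh.
run_suite() {
  FAILCOUNT=0
  for t in "$@"; do
    # Count every test command that exits nonzero.
    $t >/dev/null 2>&1 || FAILCOUNT=$((FAILCOUNT + 1))
  done
  # The function's status is that of its last command, so this
  # reports failure to the caller whenever any test failed.
  [ "$FAILCOUNT" -eq 0 ]
}

# One passing and one failing "test": the suite as a whole fails.
if run_suite true false; then
  echo "all tests passed"
else
  echo "$FAILCOUNT test(s) failed"
fi
```

Without the final check the loop's own status would be whatever the last test iteration left behind, which says nothing about earlier failures.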
> (even ignoring the slightly different "run it on Android" wrapper, i
> always run the regular toybox tests with VERBOSE=1. i don't understand
I almost always run with VERBOSE=fail myself.
> why i'd want to _not_ see the detail that might help me fix it.
At the time I was writing it, "output turns into giant piles of noise and the
info you want scrolled off very easily when more than one test fails".
Also, tests shouldn't produce uncaptured output and it's easy to lose them in
the noise when the output is chatty.
(Keep in mind I'm the guy who made the linux kernel build output terse in the
first place, http://lwn.net/2002/0117/a/blueberry.php3 precisely so I could see
the warnings. Losing warnings in the noise has been a pet peeve of mine forever. :)
(And yes, if I seem somewhat crotchety about the python 3 transition it's
because I weathered the python 2 transition, as evidenced in that link, and it
DIDN'T FIX ANYTHING. C does not do this, you can still build K&R code with
clang. Shell does not do this. This repeated flag day nonsense python keeps
doing got old.)
> especially if it's a flaky failure rather than trivially reproducible,
> but even in the latter case it's still annoying to have to run again
> with VERBOSE=1.)
The diff makes it hard to tell _which_ test failed, and when there's more than
one they tend to run together.
I now have VERBOSE=1,nopass,fail,xpect
What the default should be is questionable, but the default probably also
shouldn't change every time I come up with a new variant...