[Toybox] [SC22WG14.32335] Statement expressions

enh enh at google.com
Tue Jul 15 11:35:56 PDT 2025


On Tue, Jul 15, 2025 at 2:14 PM JeanHeyd Meneide
<phdofthehouse at gmail.com> wrote:
>
> On Tue, Jul 15, 2025 enh <enh at google.com> wrote:
> >
> > ...
> >
> > POSIX 2024 finally has a standard <endian.h>
> > (https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/basedefs/endian.h.html)
> > so that's likely to be available everywhere toybox needs it sooner
> > than <stdbit.h>.
> >
> > (i'm still annoyed that the standard says <stdbit.h> should also have
> > _functions_ rather than leaving it to the implementation to decide
> > whether macros or functions make more sense, and don't plan on adding
> > the functions to Android. even more annoyingly, without this
> > requirement we could probably have just got <stdbit.h> into _llvm_
> > [and gcc] so every libc wouldn't need to roll their own :-( )
>
>      I think you already told me, but remind me again what was the
> specific problem of them being functions versus just macros? My
> understanding of the core issue is that having an (extern) function
> might leave it up to the optimizer or compiler infrastructure to do a
> good job, whereas using a macro + builtin (or macro + statement
> expression w/ builtin) meant there was no requirement for the
> compilers to recognize the (extern) symbol and replace it?

that's one of the negatives, yes.

there's also the problem of "we can't change the past ... except if
the future is just a macro". we can't magically make functions appear
on old systems, whereas macros using builtins/statement expressions
are backwards compatible for free. so with functions, no-one gets to
use this stuff for another decade.

plus -- for all the architectures we support -- these are all just an
instruction or two [if not a compile-time constant!] so the function
call overhead would really mean "you should probably just use the
builtin directly", leaving the fundamental problem of "everyone writes
their own version of these macros" unsolved.
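
(to make the last two points concrete, here's a rough sketch rather than anyone's real header. the my_ names are made up, and the only assumption is a gcc/clang-style compiler that has the popcount builtins.)

```c
/* made-up my_ prefix so this can't clash with a real <stdbit.h>.
 * nothing here needs libc support, only a gcc/clang-style compiler,
 * so it works on systems that shipped years ago. */
#define my_count_ones_ui(x)  ((unsigned int) __builtin_popcount(x))
#define my_count_ones_ul(x)  ((unsigned int) __builtin_popcountl(x))
#define my_count_ones_ull(x) ((unsigned int) __builtin_popcountll(x))

/* with a constant argument the whole thing folds at compile time */
_Static_assert(my_count_ones_ui(0x0cu) == 2, "popcount of 0b1100 is 2");

/* with a runtime argument, on targets with a popcount instruction this
 * is typically one instruction inline, not a call into libc.
 * bits_set_in_mask() is just a hypothetical caller. */
static unsigned int bits_set_in_mask(unsigned long long mask) {
    return my_count_ones_ull(mask);
}
```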

i get that there are crappy cpus where some/all of these are going to
be function calls anyway, and there are going to be compilers where
you can't write the macros (because there's no _standard_ way to write
the macros), but trying to disallow the _option_ just seemed like a
pointless kick in the nuts for those who can do better.

and, like i said in the previous mail, if "macro only" was an option,
we'd be more likely to have "one true <stdbit.h>" in each _compiler_
rather than in each libc. (which is obviously a choice you might want
to make either way -- the musl guy is certain to make the opposite
choice, for example -- but from the perspective of all the developers
i care about "Android and iOS literally have the same header" would
have been great.)

>      If I remembered that correctly, my workaround for that when the
> quality-of-implementation on the compiler's side is too weak is to
> first define the function and then layer it over with a same-name
> macro:
>
> extern unsigned int stdc_count_ones_ui (unsigned int __value);
> #define stdc_count_ones_ui(_VALUE) __builtin_popcount(_VALUE)
>
> int main () {
>      unsigned int x = 0b001100;
>      unsigned int y = 0;
> #if 0
>      // UNCOMMENT to trigger a LINKER error, showing it's
>      // referring to the actual exported function
>      auto fn_ptr = stdc_count_ones_ui;
>      y = fn_ptr(x);
> #endif
>      return stdc_count_ones_ui(x) + y;
> }
>
>      ( https://godbolt.org/z/6MfKfxq6f ) IIRC __builtin_popcount is
> the one without UB, so it should be fine to call naked in a macro like
> this. Others might need either ? : or statement expressions to fix it
> (such as __builtin_clz or __builtin_ctz, I believe).

it's not as bad as that (having implemented all this as macros for
clang [which unlike gcc doesn't have any fancy new stdbit-specific
builtins] last week) --- the clzg and ctzg builtins both take an
optional second argument as the "value to use in the UB case".
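
(roughly like this. it's a sketch, not the actual header i wrote: the my_ and MY_ names are made up, and the #else branch is my own addition so the example still builds on compilers that predate clzg/ctzg.)

```c
#include <limits.h>

/* width of unsigned int in bits, as an int (the type the builtins
 * want for their second argument) */
#define MY_UINT_BITS ((int) (sizeof(unsigned int) * CHAR_BIT))

#ifdef __has_builtin
#  if __has_builtin(__builtin_clzg) && __has_builtin(__builtin_ctzg)
#    define MY_HAVE_CLZG_CTZG 1
#  endif
#endif

#ifdef MY_HAVE_CLZG_CTZG
/* the second argument is the value returned when x == 0, which is
 * exactly the case that's UB for plain __builtin_clz/__builtin_ctz */
#define my_leading_zeros_ui(x) \
    ((unsigned int) __builtin_clzg((unsigned int) (x), MY_UINT_BITS))
#define my_trailing_zeros_ui(x) \
    ((unsigned int) __builtin_ctzg((unsigned int) (x), MY_UINT_BITS))
#else
/* fallback for older compilers: guard the zero case with ? : instead.
 * note that this evaluates x twice. */
#define my_leading_zeros_ui(x) \
    ((x) ? (unsigned int) __builtin_clz(x) : (unsigned int) MY_UINT_BITS)
#define my_trailing_zeros_ui(x) \
    ((x) ? (unsigned int) __builtin_ctz(x) : (unsigned int) MY_UINT_BITS)
#endif
```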

i think stdc_bit_ceil(), stdc_bit_floor(), and stdc_has_single_bit()
are the only ones that _need_ statement expressions. (assuming you're
guaranteeing the argument is only evaluated once.)
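
(a sketch of why those ones want statement expressions: the argument gets used more than once, so it has to be copied into a local first. the my_ names are made up, i've only done the unsigned int flavor, and returning 0 from bit_ceil when the result doesn't fit is my reading of what the standard asks for.)

```c
#include <limits.h>

#define MY_UINT_BITS ((int) (sizeof(unsigned int) * CHAR_BIT))

/* true iff exactly one bit is set. x is evaluated exactly once. */
#define my_has_single_bit_ui(x) __extension__ ({ \
    unsigned int my_x_ = (x); \
    my_x_ != 0 && (my_x_ & (my_x_ - 1)) == 0; \
})

/* largest power of two <= x, or 0 if x is 0 */
#define my_bit_floor_ui(x) __extension__ ({ \
    unsigned int my_x_ = (x); \
    my_x_ ? 1u << (MY_UINT_BITS - 1 - __builtin_clz(my_x_)) : 0u; \
})

/* smallest power of two >= x, or 0 if that doesn't fit in the type.
 * the range checks also keep __builtin_clz away from 0 and keep the
 * shift count below the type width. */
#define my_bit_ceil_ui(x) __extension__ ({ \
    unsigned int my_x_ = (x); \
    my_x_ <= 1 ? 1u : \
    my_x_ > (1u << (MY_UINT_BITS - 1)) ? 0u : \
    1u << (MY_UINT_BITS - __builtin_clz(my_x_ - 1)); \
})
```

with the naive "(x) && !((x) & ((x) - 1))" spelling, something like my_has_single_bit_ui(i++) would increment i more than once.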

> But this should
> buffer things while compilers get around to universally recognizing
> these function calls as the new "builtins", so we have a more
> even-handed performance across all implementations.
>
>      One of the things strongly encouraged was that there was an
> external symbol that could be stuck in a function pointer, and making
> it unspecified whether there was one or not was a no-go for a few
> people and vendors (that is, they were not comfortable with the
> eventual consequences of leaving these to be macros). I was hoping
> this would be enough to get across the finish line for most people,
> combined with the type-generic macro version of each bit function
> (e.g. stdc_count_ones, without a suffix) that still allows for
> high-quality of implementation if they were used directly.
>
>      But maybe that's not a good enough set of options?

yeah, i think it's almost always going to be a mistake to end up
calling the functions (rather than having an instruction or two
inlined), and the 0.001% that actually have a legitimate use for that
can trivially do

static unsigned my_stdc_foo_uc(unsigned char x) { return stdc_foo_uc(x); }

in the place that wants a function pointer anyway.
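
(spelled out a bit more, as a sketch only: my_count_ones_uc stands in for whatever macro-only stdc_foo_uc you have, and sum_with() is a made-up consumer that insists on a function pointer.)

```c
#include <stddef.h>

/* stand-in for a macro-only implementation (made-up name) */
#define my_count_ones_uc(x) \
    ((unsigned int) __builtin_popcount((unsigned char) (x)))

/* the rare caller that needs an address writes a one-line wrapper */
static unsigned int my_count_ones_uc_fn(unsigned char x) {
    return my_count_ones_uc(x);
}

/* hypothetical consumer: applies fn to each byte and sums the results */
static unsigned int sum_with(unsigned int (*fn)(unsigned char),
                             const unsigned char *p, size_t n) {
    unsigned int total = 0;
    for (size_t i = 0; i < n; i++) total += fn(p[i]);
    return total;
}
```

everyone else keeps calling the macro directly and gets the inlined version.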

to me this is a case of academic purity getting in the way of the
least bad implementation option (both in terms of "performance" and
also "can i use it _today_, on _existing_ deployed systems?"). at
least for those of us with the luxury of not having to worry about
1980s cpus or 1-cent microcontrollers or whatever :-) (though i'd
argue even for them, it doesn't really matter whether the function
they eventually end up calling is in libc or in compiler-rt/libgcc.)

> Sincerely,
> JeanHeyd

