[Toybox] [SC22WG14.32335] Statement expressions
Rob Landley
rob at landley.net
Mon Jul 14 23:15:22 PDT 2025
On 7/14/25 10:01, JeanHeyd Meneide wrote:
> On Mon, Jul 14, 2025 Alejandro Colomar <une+c at alejandro-colomar.es> wrote:
>>
>> Hi Chris, Jakub,
>>
>> I was talking with Elliott Hughes (Bionic maintainer) and Rob Landley
>> (toybox), and Elliott reminded me something:
>>
>> Why did the committee standardize typeof() at all without standardizing
>> ({})? They almost always come together.
>
> Because the Committee moves slow and people don't advocate for
> things that are important, so things have to come one feature at a
> time when someone finally picks it up. For the record, one of the
> earliest mentions of typeof as a potential candidate for
> standardization was in the earliest-available C Rationale document; it
> says it needed more bake time despite people's excitement.
>
> Then nothing happened for 30 years.
>
> Also, Statement Expressions were met with great enthusiasm in
> 2007 to be standardized after a paper by Nick Stoughton surveyed all
> of the existing extensions at the time.
>
> Then nothing happened for 19 years.
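For anyone who hasn't seen why typeof and ({}) "almost always come together", here is a minimal sketch of the usual pairing. MAX_ONCE and demo_max are hypothetical names for illustration, not anything from toybox or a committee paper. The statement expression gives the macro a block with locals that still yields a value, and typeof lets those locals take the type of whatever was passed in, so each argument is evaluated exactly once. Both spellings below are gcc/clang extensions; C23 standardized typeof but not statement expressions.

```c
/* MAX_ONCE is a hypothetical name. ({ ... }) and __typeof__ are
 * gcc/clang extensions; C23 spells the latter "typeof". */
#define MAX_ONCE(a, b) ({ \
  __typeof__(a) _a = (a); \
  __typeof__(b) _b = (b); \
  _a > _b ? _a : _b; \
})

/* Hypothetical demo: the increment inside the macro argument happens
 * once, which a plain ((a) > (b) ? (a) : (b)) macro can't promise. */
int demo_max(int *counter)
{
  return MAX_ONCE((*counter)++, 0);
}
```

Without typeof the temporaries would need a hardcoded type, and without the statement expression there would be nowhere to declare them inside an expression.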
Toybox's move from claiming c99 to claiming c11
(https://github.com/landley/toybox/commit/3625a260065b and thus
https://github.com/landley/toybox/commit/0c566f6f9a05) was in 2022. At
the time, I thought __has_include() was moving from "compiler extension"
to "standardized", but Elliott corrected me, which was when I asked
about ? : ala:
http://lists.landley.net/pipermail/toybox-landley.net/2025-July/030769.html
And here we are.
My last engagement with the C committee was probably back around 2023
(I'm guessing circa https://landley.net/notes-2023.html#05-02-2023) when
I wanted to implement "read" in toybox's shell but line reads using
readline() and friends used a FILE * that did automatic readahead into
the FILE * buffer, and there was no way to ask that FILE * how much
readahead data was in said buffer so I could fread() it back _out_ and
pass it on to the child process, which has to operate on the underlying
filehandle because that's how processes work:
Ala:
  $ echo $'one\ntwo\nthree\nfour\nfive' | \
      { read i; echo GOT=$i; head -n 2; }
  GOT=one
  two
  three
  $
That's the desired behavior, but if I don't rescue data out of the FILE
buffer the child process may not see any input because the
fread() ate it all as readahead and never gave it back.
For a seekable file I can fseeko(ftello(fp)) but the above is a pipe,
which I can't rewind, so I need to pass on the data that was read. I
couldn't find a portable way to ask a FILE * "how much can I fread()
without pulling more data from the filehandle". And I mean it's
frustrating, they BRACKET THAT with a bunch of accessor functions,
there's __fpending() for output bytes:
https://linux.die.net/man/3/__fpending
But nothing for INPUT BYTES read ahead in an INPUT STREAM. (And Elliott
got sad at my read_line() function that read input a byte at a time to
avoid ever overshooting, because it was slow. Hence trying to make
larger read sizes work.)
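A minimal sketch of the byte-at-a-time approach described above, assuming a plain file descriptor and a caller-supplied buffer. read_line_fd is a hypothetical helper written for this illustration, not toybox's actual read_line(). Because it never asks read() for more than one byte, it cannot consume anything past the newline, at the cost of one syscall per character.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not toybox's real code. Reads from fd one byte
 * at a time until newline, EOF, or the buffer is full. Stores a NUL
 * terminator and returns the number of bytes stored before it (newline
 * included), or -1 on error. */
ssize_t read_line_fd(int fd, char *buf, size_t len)
{
  size_t used = 0;

  if (!len) return -1;
  while (used+1 < len) {
    char c;
    ssize_t got = read(fd, &c, 1);

    if (got < 0) return -1;
    if (!got) break;          /* EOF */
    buf[used++] = c;
    if (c == '\n') break;     /* stop exactly at the line boundary */
  }
  buf[used] = 0;

  return used;
}
```

After this returns, everything following the newline is still sitting in the pipe for a child process to read from the same descriptor, which is the property the stdio-based version loses.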
When I asked the posix committee they said that FILE * was opaque to
them and to go ask the C standards committee, and the C standards person
who replied to my query through the web form thingy said that
filehandles weren't in ISO C at all. (At least posix had fdopen() and
fileno() to translate between the two. The posix side was willing to
reach out, but not to standardize or provide an accessor function to the
contents of the other committee's struct. And the C committee wouldn't
do it because doing I/O through anything OTHER than a FILE * was
"nonstandard". Apparently child processes inheriting
stdin/stdout/stderr from a parent are not their problem.)
So even though FILE * always had a variable storing the amount of
remaining input data (it HAS to) the member had a different name on
glibc and on musl and on bsd/mac, with no standard accessor functions,
and neither side wanted to standardize this because each felt it was the
other's responsibility.
(Note that if you set O_DIRECT on the pipe on Linux, it delivers data in
the same granularity it was produced (I.E. it doesn't merge buffers,
each read() stops short at the boundaries of the corresponding write())
which MOSTLY fixes this for real world inputs, although the above test
is still borked because echo is producing a single atomic write that
read(bufsiz) presumably eats before scanning for \n so it would still
consume future lines, but that's as close as I could get and called it
good enough for now.)
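A rough sketch of the O_DIRECT pipe behavior described above, assuming Linux 3.4 or later and using pipe2() to set the flag at creation time (the message doesn't say how the flag was set, so that part is an assumption). first_read_size is a hypothetical demo function. It does two separate write() calls and reports how much the first large read() hands back.

```c
#define _GNU_SOURCE   /* for pipe2() and O_DIRECT */
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: write two lines with separate write() calls, then
 * return how many bytes a single big read() gets. Pass O_DIRECT for a
 * packet-mode pipe, or 0 for an ordinary one. Returns -1 on failure. */
ssize_t first_read_size(int flags)
{
  int fds[2];
  char buf[4096];
  ssize_t got = -1;

  if (pipe2(fds, flags)) return -1;
  if (write(fds[1], "one\n", 4) == 4 && write(fds[1], "two\n", 4) == 4)
    got = read(fds[0], buf, sizeof(buf));
  close(fds[0]);
  close(fds[1]);

  return got;
}
```

With O_DIRECT the read should stop at the first write's boundary (4 bytes), while an ordinary pipe merges both writes (8 bytes). As noted above, this doesn't help when the producer emits several lines in one write().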
> So, the short and long of it is that if someone doesn't do it, it
> doesn't get done.
I moved toybox from saying C99 to saying C11 (in 2022) to work around a
compiler bug in clang. At the time, I thought the __has_include() I was
already using (on both gcc and clang, between that and the macros in the
big ":|cc -dM -E -" dump I eliminated almost all compile-time probes)
came with the new standard. I was also already making regular-ish use of
things like:
  else TT.hdr.mode |= (char []){8,8,10,2,6,4,1,8}[tar.type-'0']<<12;
Which vim's syntax highlight only stopped turning red for last year.
(Well, debian version upgrade.) Everything except the other way of declaring
inline worked fine in -gnu99 or whatever it was I'd been building under
before on gcc and clang, the move to claiming C11 was mostly "eh, I'm
apparently already using it" and it was a dozen years old at that point.
(And I'd finally given up on backwards compatibility regression testing
against Ubuntu 2008 because it was missing kernel features I was using.
Although the "centos forever" guys actually using the 10 year support
horizon made sad faces at me shortly afterwards because I broke them.)
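The (char []){...}[index] line above is a C99 compound literal indexed in place: an anonymous array used as a lookup table without declaring a named one. Here is a standalone sketch of the same idea. tar_type_to_mode is a hypothetical wrapper, not the toybox code, and the range check is an addition for the example. The table values are the ones from the line above; shifted left by 12 they land in the file type bits of st_mode as Linux numbers them.

```c
/* Hypothetical standalone version of the lookup, not toybox's tar code.
 * Maps a tar typeflag character '0'..'7' to file type bits:
 * '0' and '1' regular file, '2' symlink, '3' char device,
 * '4' block device, '5' directory, '6' fifo, '7' regular file. */
unsigned tar_type_to_mode(char type)
{
  if (type < '0' || type > '7') return 0;

  /* Anonymous array created and indexed in one expression. */
  return (char []){8,8,10,2,6,4,1,8}[type-'0']<<12;
}
```

On Linux tar_type_to_mode('5') comes out as 040000, which is S_IFDIR there, and '2' comes out as 0120000, which is S_IFLNK.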
Mostly I just test which compilers have what and make do with the
reality at hand. It's a lot easier when you write off Windows as "never
to be supported", then you can rely on things like LP64. :)
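A small sketch of what "test which compilers have what" can look like using only the preprocessor, assuming gcc or clang. The HAVE_* names and the deliberately nonexistent header name are made up for the example. __has_include() answers "is this header here" without a configure-style probe, and __LP64__ is one of the predefined macros that shows up in the ":|cc -dM -E -" dump on LP64 targets.

```c
/* Compilers without __has_include get a fallback that always says no. */
#ifndef __has_include
#define __has_include(x) 0
#endif

/* HAVE_STDIO_H and HAVE_MISSING_H are hypothetical names. */
#if __has_include(<stdio.h>)
#define HAVE_STDIO_H 1
#else
#define HAVE_STDIO_H 0
#endif

/* This header name is invented and assumed not to exist anywhere. */
#if __has_include(<toybox_no_such_header_example.h>)
#define HAVE_MISSING_H 1
#else
#define HAVE_MISSING_H 0
#endif

/* On LP64 targets gcc and clang predefine __LP64__, so code can lean
 * on long and pointers both being 64 bits. */
#ifdef __LP64__
_Static_assert(sizeof(long) == 8 && sizeof(void *) == 8, "not LP64");
#endif
```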
If there was a standard that let me remove giant horrible things like:
https://github.com/landley/toybox/blob/master/lib/portability.h
https://github.com/landley/toybox/blob/master/lib/portability.c
I'd be more interested, but this seems unlikely.
(Honestly I'm probably mostly just writing C89 with compiler extensions.
With LP64 I don't really need uint32_t and friends... I guess %p wasn't
in c89? I claimed C99 because I had online copies of the C99 spec and
_didn't_ have online copies of C89...)
> Sincerely,
> JeanHeyd
Rob