[Toybox] [PATCH 1/2] Handle large read and write lengths.

Rob Landley rob at landley.net
Sun Aug 15 02:40:50 PDT 2021


On 8/14/21 7:10 AM, Samanta Navarro wrote:
> Hi Rob,
> 
> I hope that you have recovered from your sickness by now!

Alas, no. :(

That's why I've been holding off on dealing with stuff like this: I wanted to
give it more attention than I currently have, but eh, you work with the brain
you've got, not the one you'd like.

(Warning: Pascal's apology for writing a long letter because he didn't have the
focus/spoons to write a short letter: my primary failure mode is
blathering/tabsplosion/tangents, and I type fast.)

> On Mon, Aug 09, 2021 at 02:44:45AM -0500, Rob Landley wrote:
>> > The functions readall and writeall can return an error value by mistake
>> > if more than 2 GB of data are read or written.
>> 
>> That was intentional. If your file sizes are that big we probably want to mmap()
>> stuff.
> 
> The functions read and mmap have different use cases.

Yes, I know.

> You cannot mmap a
> pipe, a socket or any other form of byte streams.

Yes, I know.

>> Actually on 32 bit Linux architectures ssize_t is also long long because "large
>> file support" was introduced over 20 years ago:
> 
> Did you mean off_t? The signed size_t type is 32 bit on 32 bit systems.
> But off_t depends on large file support compiled in. So it's sometimes
> 32 and sometimes 64 bit.

This is why I hate the magic macros. I always have to look up what they mean on
which platform. I use the real types wherever possible because then I DON'T
have to track the weirdness.

When dealing with C, I like to know what I'm doing.

>> So if we're changing the type it should change to long long
> 
> I disagree here.

As do I: if the old one is sometimes int then the new one should be int, I.E.
what it is now. I just need to audit the callers, and add error checking in some
of them.

> First off, I would not recommend using "long long"
> just because it's most of the time the same size.

It's the 64 bit primitive integer type?

> The data types
> exist for a reason, the most important I think is the implied intention
> of their use cases.

Toybox relies on the LP64 memory model, as documented in
https://landley.net/toybox/design.html#portability
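
To make "relies on LP64" concrete, here's a minimal sketch (not actual toybox
code, and using C11's _Static_assert for brevity) of the size assumptions that
hold on both the ILP32 and LP64 targets Linux ships:

  /* Type size invariants toybox depends on: these hold on ILP32 (32 bit)
     and LP64 (64 bit) Linux targets alike. */
  _Static_assert(sizeof(int) == 4, "int is 32 bits");
  _Static_assert(sizeof(long) == sizeof(void *), "long matches pointer size");
  _Static_assert(sizeof(long long) == 8, "long long is 64 bits");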

> Use size_t for memory operations.

No.

> Use off_t for file operations.

No.

I am _aware_ of what the standards say, I just don't care about portability to
Windows. And nothing else _isn't_ LP64 anymore. (Even Windows has added its own
eniw implementation to run linux binaries on windows: that's wine spelled
backwards, with, according to Sam Vimes, the possible addition of eniru.)

Linux is lp64, bsd is lp64, mac is lp64, android and ios are lp64, zseries is
lp64, slowaris and aix and hp-ux and irix and OSF/1 all were lp64... There were
a couple of historical oddballs that got it wrong: HAL's port of Solaris wasn't
compatible with Sun's Solaris (and its corpse was acquired by Fujitsu in 1993),
and the truly INSANE 1990's Cray Unicos version that made even "short" 64 bits
got replaced with an LP64 version over 20 years ago. But nothing since the
whole Y2K frenzy that I'm aware of.

You have indeed identified a real bug: read with a size range restricted to int
can have callers asking for more data. My reply was that it never SHOULD have
callers asking for more data (that's pilot error), and I need to figure out how
to enforce that. My suggested FIX is to make sure the xread/xwrite callers are
all ok with a 2 gigabyte granularity.
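
As an illustrative sketch of what enforcing that could look like (this is NOT
toybox's actual lib/ code): keep the transaction length in int range, and
treat anything bigger as a caller bug:

  #include <errno.h>
  #include <unistd.h>

  // Sketch: transactions stay in int range, and a caller smuggling in a
  // >2GB length (which shows up negative here) gets an error instead of
  // silent mishandling.
  int readall(int fd, void *buf, int len)
  {
    int count = 0;

    if (len < 0) {
      errno = EINVAL;  // pilot error: caller asked for more than 2GB

      return -1;
    }
    while (count < len) {
      ssize_t i = read(fd, (char *)buf + count, len - count);

      if (i < 1) return i ? -1 : count;  // error, or short read at EOF
      count += i;
    }

    return count;
  }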

What is the actual _problem_ with that fix? (Elliott and I already had a tangent
about readline(). Most of the other "process a buffer in a loop" code uses
toybuf or libbuf aka page size, which is cache friendly while avoiding the worst
of byte-at-a-time processing overhead.)
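
For reference, that pattern sketched out (the function name is illustrative,
and the real global buffer is toybuf):

  #include <unistd.h>

  static char buf[4096];  // stand-in for toybuf: one page

  // Copy in to out a page at a time: cache friendly, no giant
  // allocation, none of the byte-at-a-time overhead.
  static int copy_loop(int in, int out)
  {
    for (;;) {
      ssize_t len = read(in, buf, sizeof(buf));

      if (len < 0) return -1;
      if (!len) return 0;
      if (write(out, buf, len) != len) return -1;
    }
  }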

> Use
> long long if your preferred C standard is too old for int64_t or the
> API of library functions in use wants long long.

Bionic:

libc/include/stdint.h:typedef long long __int64_t;
libc/include/stdint.h:typedef __int64_t     int64_t;

Musl-libc:

include/alltypes.h.in:TYPEDEF _Int64          int64_t;
arch/i386/bits/alltypes.h.in:#define _Int64 long long
arch/aarch64/bits/alltypes.h.in:#define _Int64 long
arch/mips64/bits/alltypes.h.in:#define _Int64 long
arch/mips/bits/alltypes.h.in:#define _Int64 long long
arch/mipsn32/bits/alltypes.h.in:#define _Int64 long long
arch/powerpc64/bits/alltypes.h.in:#define _Int64 long
arch/s390x/bits/alltypes.h.in:#define _Int64 long
arch/powerpc/bits/alltypes.h.in:#define _Int64 long long
arch/x32/bits/alltypes.h.in:#define _Int64 long long
arch/x86_64/bits/alltypes.h.in:#define _Int64 long
arch/or1k/bits/alltypes.h.in:#define _Int64 long long
arch/sh/bits/alltypes.h.in:#define _Int64 long long
arch/xtensa/bits/alltypes.h.in:#define _Int64 long long
arch/microblaze/bits/alltypes.h.in:#define _Int64 long long
arch/arm/bits/alltypes.h.in:#define _Int64 long long

glibc is of course an ifdef salad that checks whether sizeof(long) is 64 bits
and uses that, else uses long long. GNU being chock full of unnecessary code
that does nothing (and often does nothing WRONG) is why busybox and uclibc and
so on existed in the first place...

> Since read and write are used to operate on memory, size_t is the best
> choice.

Destination in memory and transaction size are two different things. Offset
within the file is a third thing. Each could theoretically have a different type.
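
The POSIX prototypes themselves already use distinct types for those roles:

  ssize_t read(int fd, void *buf, size_t count);        // memory + size
  ssize_t write(int fd, const void *buf, size_t count);
  off_t lseek(int fd, off_t offset, int whence);        // position in file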

> Or ssize_t for included error handling. And this is exactly what
> the underlying C library functions do.

I first triaged the C spec in 1991 (yes using Herbert Schildt's annotated
version, it was all I could afford), because as a teenager I independently
reinvented the concept of bytecode (not knowing about the half-dozen previous
instances like Pascal p-code) and I was trying to come up with a C compiler that
would produce bytecode running in a VM. (This is why I first started reading the
gcc source code, which was UNIMAGINABLY bad. I also looked at the "small C
compiler" from Dr. Dobbs, but it wasn't load bearing. Years later I got involved
in https://landley.net/hg/tinycc as vestigial momentum from that. And yes,
bytecode function pointers and native function pointers would have to be two
different types so would need annotating...)

Then I graduated and went to work at IBM porting OS/2 to the PowerPC where a
coworker introduced me to Java, and I pointed out that they'd missed "truncate",
and Mark English of sun replied that I'd just missed the 1.1 cutoff but he'd add
it to java 1.2 (and did). Meanwhile I sat down and started porting the "deflate"
algorithm to java (based on the info-zip code), and got compression side working
but not yet decompression when the next version came out with zlib as part of
the java standard library. So I'm used to doing work that gets abandoned or
undone again, that's normal.

I've had a copy of https://landley.net/c99-draft.html on my own website for easy
reference since the busybox days (so at least 15 years now). The header says it
came from
http://web.archive.org/web/20050207010641/http://dev.unicals.com/papers/c99-draft.html
so that's probably about when I snapshotted it (Feb 2005; I tend to mirror
locally when the original goes down).

I am entirely happy to have my opinions on this stuff changed by good arguments,
but "reading the spec to me" is probably only going to provide new information
if I _forgot_ something.

>> P.S. One of my most unsolvable todo items is what to do about readline() on
>> /dev/zero. If it's looking for \n it's just gonna allocate a bigger and bigger
>> buffer until it triggers the OOM killer. If a single line IS a gigabyte long,
>> what am I supposed to _do_ about it?
> 
> I would say: Do what open does on a system without large file support
> with large files: Return an error.

Elliott outranks you, and he basically called it pilot error. (And treating it
as such is the status quo.)

I couldn't figure out what to do about this when I WASN'T sick, so...
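
For anyone following along, the mechanism is the usual grow-until-newline
reader. The LINE_CAP below is a hypothetical illustration of the "return an
error" option, not something toybox actually does:

  #include <stdlib.h>
  #include <unistd.h>

  #define LINE_CAP (1<<30)  // hypothetical 1 gigabyte cutoff

  // On /dev/zero no '\n' ever arrives, so without the LINE_CAP check
  // this loop doubles the buffer until the OOM killer fires.
  char *read_line(int fd, size_t *len)
  {
    size_t size = 4096, used = 0;
    char *buf = malloc(size), *nb;

    while (buf) {
      ssize_t got = read(fd, buf + used, 1);  // byte at a time for clarity

      if (got < 1) break;             // EOF or error before end of line
      if (buf[used++] == '\n') {
        *len = used;
        return buf;
      }
      if (used == size) {
        if (size >= LINE_CAP || !(nb = realloc(buf, size *= 2))) break;
        buf = nb;
      }
    }
    free(buf);
    return 0;
  }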

> And as it has been discussed in enh's thread: It depends on the
> application. Does it need random access after parsing? Can it have
> random access on the file? Is a streaming approach possible?

mmap() on a 32 bit processor won't help much because your VIRTUAL address space
is capped between 1 and 4 gigs depending on which Linux memory model you
compiled the kernel with. You can remap as you go, but your "one big line"
physically CAN'T be longer than 4 gigabytes and still be seen as one block of
data in memory. (And in theory there's both x32 and the arm equivalent a la
https://lkml.org/lkml/2018/5/16/207 so this isn't _entirely_ a historical point.)
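
The "remap as you go" part, sketched (WINDOW and the function name are
illustrative): you can slide a fixed mmap() window across a huge file, but
anything spanning a window boundary never shows up as contiguous memory:

  #include <sys/mman.h>
  #include <unistd.h>

  #define WINDOW (64<<20)  // 64 megabyte view: page aligned, fits in 32 bits

  int scan_file(int fd, off_t filelen)
  {
    off_t pos = 0;

    while (pos < filelen) {
      size_t len = (filelen-pos > WINDOW) ? WINDOW : filelen-pos;
      char *map = mmap(0, len, PROT_READ, MAP_PRIVATE, fd, pos);

      if (map == MAP_FAILED) return -1;
      // ...process map[0..len-1] here: a "line" crossing window
      // boundaries still can't be one block of data in memory...
      munmap(map, len);
      pos += len;
    }

    return 0;
  }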

The tool can't demand a use case, because I do:

  diff -u <(thingy) <(thingy2)

All the time. Even if diff input is USUALLY seekable/mappable, that isn't, and
diff needs to work there too.
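
(Telling the cases apart is the easy bit, for what it's worth: lseek() fails
on a pipe, so an illustrative helper like this says which codepath you're
stuck with:)

  #include <unistd.h>

  // Nonzero if fd supports seeking: pipes and sockets, like diff's
  // <(thingy) input above, fail lseek() with ESPIPE.
  int fd_is_seekable(int fd)
  {
    return lseek(fd, 0, SEEK_CUR) != -1;
  }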

The question is "what do I want/need to support" (or at least optimize for),
and the answer sadly can't always be "everything".

Having TWO codepaths to do the same thing in "fast" and "slow" cases is an
invitation to bugs, and something toybox tries to avoid. Yes, "tail" does that
and there's a comment apologizing for it, but the speed difference on tailing a
500 gigabyte file is big enough I couldn't get away with NOT doing it. (There
were posts about it here in the archive and I probably blogged about it at the
time.) It's one of them Kobayashi Maru no-win scenario things: you don't have
to believe in them, reality is what persists when you STOP believing in it.
Or possibly what happens while you're making other plans.

Anyway, I dunno how to fix it and I'm out of brain.

> Sincerely,
> Samanta

Rob

P.S. The J.J. Abrams movies have done to ST2TWOK what the Hitchhiker's Guide
movie did to the BBC miniseries, or what GPLv3 did to GPLv2.

P.P.S. Kobayashi means "small forest" and Maru means "circle" in Japanese. These
days it's where you find dragon maids and/or a particularly photogenic cat.


