[Toybox] [PATCH 1/2] Handle large read and write lengths.

enh enh at google.com
Mon Aug 9 14:20:29 PDT 2021


On Mon, Aug 9, 2021 at 12:26 AM Rob Landley <rob at landley.net> wrote:

> Sorry for the delay, I have a _really_ bad cold.
>
> On 8/7/21 7:11 AM, Samanta Navarro wrote:
> > The functions readall and writeall can return an error value by mistake
> > if more than 2 GB of data are read or written.
>
> That was intentional. If your file sizes are that big we probably want to
> mmap() stuff. Single atomic transactions greater than 2 gigabytes are
> probably a bad idea from a latency standpoint (circa 2015 kernel used to
> hold locks across these, probably fixed now but I haven't checked) and
> mallocing buffers that big is probably also a bad idea. (I hit a glibc
> version that wouldn't allow a malloc greater than 128 megs, haven't tested
> recently...)
>
> That said, I'd want it to hard error rather than integer overflow back into
> a sane value (6 gigs being treated as 2...)
>
> > This happens because int and ssize_t are of different sizes on 64 bit
> > architectures. Using ssize_t resolves the issue because read and write
> > return ssize_t already.
>
> Actually on 32 bit Linux architectures ssize_t is also long long because
> "large file support" was introduced over 20 years ago:
>
> https://static.lwn.net/1999/0121/a/lfs.html
>
> And 5 years later even 2 terabytes was limiting:
>
> https://lwn.net/Articles/91731/
>
> So if we're changing the type it should change to long long, but this is
> really pilot error in the callers. (That said, readfd() and lskip() are
> making this mistake, so there needs to _be_ an enforcement mechanism...)
>
> Rob
>
> P.S. One of my most unsolvable todo items is what to do about readline() on
> /dev/zero. If it's looking for \n it's just gonna allocate a bigger and
> bigger buffer until it triggers the OOM killer. If a single line IS a
> gigabyte long, what am I supposed to _do_ about it?
>
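
For reference, the "6 gigs being treated as 2" case quoted above can be
reproduced in a few lines. This is only a sketch, not toybox code: the names
low32() and checked_len() are hypothetical, and checked_len() shows one
possible shape for the "hard error" behaviour, assuming the callers keep
using int for lengths.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical: what survives if only the low 32 bits of a non-negative
 * 64-bit byte count are kept. Conversion to an unsigned type is defined
 * as reduction modulo 2^32, so this is portable. */
static uint32_t low32(int64_t len)
{
  return (uint32_t)len;
}

/* Hypothetical: narrow a 64-bit byte count to int, returning -1 instead
 * of silently wrapping when the value does not fit. */
static int checked_len(int64_t len)
{
  if (len < 0 || len > INT_MAX) return -1;

  return (int)len;
}
```

With these, low32(6LL<<30) comes out as 2 GiB, while checked_len(6LL<<30)
reports -1 and leaves it to the caller to fail loudly.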

nothing? seems like toybox should just do what was asked, not least because
whether a gigabyte is large or small is a matter of opinion? (he says,
still deeply scarred by all the Bell Labs boys' fixed-length buffers... i
had to use _perl_ because of them! PERL!)
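
As an illustration of "just do what was asked", here is a minimal sketch of a
line reader with no fixed-length buffer, built on POSIX getline(), which grows
its buffer as needed. The helper name read_whole_line() is hypothetical and
this is not how toybox's own line reading is implemented; on an endless input
such as /dev/zero it would keep growing until allocation fails.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: return the next line from fp as a malloc()ed string
 * with the trailing newline removed, storing its length in *len.
 * Returns NULL at end of file or on error. Caller frees the result. */
static char *read_whole_line(FILE *fp, size_t *len)
{
  char *line = NULL;
  size_t cap = 0;
  ssize_t n;

  assert(fp && len);

  /* getline() reallocates line as needed, so there is no length limit
   * other than available memory. */
  n = getline(&line, &cap, fp);
  if (n < 0) {
    free(line);
    return NULL;
  }

  /* Drop the newline if the line ended with one. */
  if (n > 0 && line[n-1] == '\n') line[--n] = 0;
  *len = (size_t)n;

  return line;
}
```

The only policy decision left is what to do when the allocation itself fails,
which getline() reports by returning -1.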



