[Toybox] dd cleanup notes

enh enh at google.com
Tue Jul 12 11:18:10 PDT 2016


On Tue, Jul 12, 2016 at 11:03 AM, Rob Landley <rob at landley.net> wrote:
> This xmove_fd() thing should just be xopen_nostd() in lib, and I should
> probably go through and audit the existing commands to see if they
> should use it. (I've been doing a lot of mental bookkeeping of "what if
> this returns stdin, will it still work ok" but just having a function
> that DOESN'T is a lot cleaner. And I should plug the holes with
> /dev/null. And this implementation assumes that if our parent closed
> stdout() they didn't provide us an existing fd #3, dunno where they got
> that assumption from...)
>
> The function strsuftoll() is a bit awkward, for a comple reasons.
>
> The big one is we have atolx_range() in lib already, which ALMOST does
> what this wants. But that uses long instead of long long, which is a
> longstanding issue I blogged about recently at
> http://landley.net/notes-2016.html#10-07-2016 and the tl;dr of which is
> "long being 32 bits on 32 bit systems is kinda awkward". I might switch
> atolx_range() to use long long internally soonish, but I can't easily
> change the option parsing plumbing because it depends on LP64 and
> sizeof(pointer) != sizeof(long long) on 32 bits.
>
> The other problem is that the b and w suffixes are defined differently
> here: in atolx() "b" is a synonym for "c" (meaning bytes), and here "b"
> is 512 and "w" is 2. I've literally never seen either of these used, but
> they're in the posix spec... along with ebcdic conversion support and
> some other real lunacy.
>
> Why did I add b and c to atolx?  Commit 87c06e15329a says "find" needs c
> to mean bytes, and commit b8ef889cbfae says od needed "b" to mean bytes.
> Yay source control!
>
> Another thing posix specifies here, which this doesn't seem to do, is
> bs=1024kx1024M (which would be 1 petabyte, but still). If we're ignoring
> that can we get away with ignoring the b and w extensions and just use
> atolx_range()? (Which would be limited to 2G maximum unit size on 32 bit
> systems?) If I'd initially written it that way, I'm pretty sure the
> answer is "yes" (since moving to 64 bit solves that size limitation).
> But since I got a version in pending that people are actually using,
> this would be a regression, so I guess I don't get to make that
> architectural decision. (Pending!)
>
> As for compatibility with "b" and "w", if I was planning to implement
> the majority of what posix says about this command I'd find that
> deviation annoying, but given ebcdic conversion support and 123x234
> already being dropped in this implementation with nobody noticing, I'd
> happily have done without it if it had been my choice (and waited for
> somebody to complain; adding "w" is trivial but silly, adding "b"
> conflicts with the existing "b".)
>
> Would adding "d" to specify decimal sizes be actually useful to anyone?
> The dd man page says:
>
>   N and BYTES may be followed by the following multiplicative suffixes:
>   c =1, w =2, b =512, kB =1000, K =1024, MB =1000*1000, M =1024*1024,
>   xM =M GB =1000*1000*1000, G =1024*1024*1024, and so on for T, P, E,
>   Z, Y.
>
> But what this IMPLEMENTS is
>
>   { "c", 1 }, { "w", 2 }, { "b", 512 },
>   { "kD", 1000 }, { "k", 1024 }, { "K", 1024 },
>   { "MD", 1000000 }, { "M", 1048576 },
>   { "GD", 1000000000 }, { "G", 1073741824 }
>
> I.E. ubuntu has an extra "B" suffix meaning decimal,and this has an
> extra "D" meaning decimal...? (I can add it to atolx(), just... lemme
> know what makes sense here?)

yeah, it's a mess. fwiw, the BSD dd we're currently using on Android
just doesn't support decimal, and no-one's ever said anything. (they
did complain about the inability to specify things in hex [already
fixed], so it's not like they're not paying attention :-) .)

> Hang on...
>
>       case C_BS:
>         TT.in.sz = TT.out.sz = strsuftoll(arg, 1, LONG_MAX);
>         break;
>       case C_IBS:
>         sz = strsuftoll(arg, 1, LONG_MAX);
>         if (!(toys.optflags & C_BS)) TT.in.sz = sz;
>         break;
>       case C_OBS:
>         sz = strsuftoll(arg, 1, LONG_MAX);
>         if (!(toys.optflags & C_BS)) TT.out.sz = sz;
>
> That's capping the range to LONG_MAX. So half the uses here are acting
> like atolx_range() _ANYWAY_, although count= seek= and skip= can do the
> full unsigned long long range. So if you fed in a block size of 4G it
> would go "boing" even on 64 gigs...

the BSD dd we're currently using on Android caps block sizes to
*u*int_max, and the others to *ll*ong_max.

> Let's see, what else:
>
> Does C_COUNT need to be recorded as a flag?
>
>   $ dd if=/dev/zero of=boing bs=1 count=1
>   0+0 records in
>   0+0 records out
>   0 bytes (0 B) copied, 0.000591678 s, 0.0 kB/s
>
> Yes, apparently. Well, I could use -1 since...
>
>   $ dd if=/dev/zero of=boing bs=1 count=-1
>   dd: invalid number ‘-1’
>
> Hmmm, if you seek output but don't conv=notrunc, presumably it truncates
> to _that_ point? (Yes it does. I need to add a test for that to the test
> suite...)
>
> If you're wondering why I haven't cleaned this one up before now... it's
> got issues.
>
> Rob
> _______________________________________________
> Toybox mailing list
> Toybox at lists.landley.net
> http://lists.landley.net/listinfo.cgi/toybox-landley.net



-- 
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.



More information about the Toybox mailing list