[Toybox] Numeric values in dd operands

scsijon scsijon at lamiaworks.com.au
Tue Feb 20 15:08:22 PST 2018


fyi,

Both quaternary (base 4) and octal (base 8) are used by analog device 
inputs even today. There are a lot of security systems out there! And you 
can even get them storing and using the data stacked in the ROM, where 
octal can appear as hexadecimal but is really upper and lower octal 
pairs, such as resistance tolerance ranges for tape and switches. I 
admit that after some research it seems most of these devices are now 
octal, so maybe I should not have brought quaternary up.

I was asked about radix (64bit), as it seems to be hard to find data 
on. It's written in double octal and used at least internally (register 
to register) in a number of RAID systems, and if you know what you're 
doing you can use these registers to speed throughput up or change data 
priority. It's also used in mainframe timeslot control, and there is 
code for Linux timeslots available for a number of mainframes, but you 
do (did) need implicit approval from LinusT to implement it. All part of 
some things I did in my previous work-life.

And I'll leave it at that, having most likely confused a lot of the 
group. My apologies.

scsijon

On 02/21/2018 04:32 AM, enh wrote:
> On Tue, Feb 20, 2018 at 9:28 AM, Rob Landley <rob at landley.net> wrote:
>> On 02/19/2018 09:09 PM, scsijon wrote:
>>> On 02/20/2018 08:32 AM, toybox-request at lists.landley.net wrote:
>>>> Are you actually using that mid-number multiplier? I was asking on the list last
>>>> year if anyone anywhere actually did that. (It's a relic from before the shell
>>>> provided $((123*456)).)
>>>>
>>>
>>> I always interpreted it as the ability for someone putting a double character,
>>> such as kb in instead of just a k, I have seen 12gb316 used (meaning
>>> 12,316,000,000) in the auto-output for raid drive stats before today. I hadn't
>>> thought of it being a multiplier character.
>>
>> Huh. That's an interesting non-posix extension I hadn't heard of before.
>>
>>>>> The 0.7.5 implementation assumes that x is part of a hexadecimal prefix so 0x12
>>>>> is interpreted as 18 rather than 0, and 3x12 is an error.
>>>
>>> And 3x12 could be interpreted as 3 to the power 12 of whatever base is being
>>> used as it would be in calculus.
>>
>> Not according to http://pubs.opengroup.org/onlinepubs/9699919799/utilities/dd.html
>>
>>> For the bs=, cbs=, ibs=, and obs= operands, the application shall supply an
>>> expression specifying a size in bytes. The expression, expr, can be:
>>>
>>>   A positive decimal number
>>>
>>>   A positive decimal number followed by k, specifying multiplication by 1024
>>>
>>>   A positive decimal number followed by b, specifying multiplication by 512
>>>
>>>   Two or more positive decimal numbers (with or without k or b) separated by
>>>   x, specifying the product of the indicated values
>>>
>>> All of the operands are processed before any input is read.
>>
>> Product means multiply.
>>
>> I'm not _against_ extensions, toybox is also handling megabyte and gigabyte and
>> such because it's not the 1980's anymore. (And blocks went from 512 bytes to 4k
>> ten years ago: https://lwn.net/Articles/322777/ but that's _far_ too recent for
>> posix to even be aware it's happened, let alone respond to it.)
>>
>>> However a leading 0 (0x12) would only define,
>>> not say, it would depend on the base being used, not just hexadecimal. It could
>>> also be quaternary (base4), octal (base8), or even radix (64bit),
>>
>> I didn't make the 0 means octal and 0x means hexadecimal prefixes up:
>>
>>   http://pubs.opengroup.org/onlinepubs/9699919799/functions/strtol.html
>>
>> It's already widely implemented in the Linux command line. Even though posix
>> doesn't say "printf %d 0x1234" should print 4660, ubuntu's printf does (and yes
>> that also means 01234 prints as 668).
>>
>> Every toybox command that takes a number argument has been doing this for years.
>> You can go "head -n 0xa". This is the first complaint about it so far.
>>
>> In toybox I try to err on the side of _consistency_. All the commands behave the
>> _same_ way. This would be an explicit exception where dd count=01234 is _not_
>> interpreted as 668, meaning dd is a special case different from everything else.
>>
>>> all used in computation and hardware data collection input circuits.
>>
>> But not in c99 or posix.
>>
>>> I wonder how you would define/control this?
>>
>> I wouldn't?
>>
>> Base 4 output has never historically been part of any unix command I'm aware of.
>> It's not in c99, it's not in posix, it's not in the linux standard base, it's
>> not in busybox, it's not in ubuntu's current command line, it's not in my old
>> Red Hat 9 qemu image (which I flung on https://busybox.net/downloads/qemu/
>> years ago and is strangely enough still there, although the README isn't. user
>> busybox password busybox, I think that's the root password too.)
>>
>> I'm not trying to make up new stuff, I'm trying to serve an existing userbase
>> of people who are part of a 50 year tradition. (The first pdp-7 unix was
>> written in 1969, the 50th anniversary is next year.) The best way to serve them
>> is to be consistent with historical practice.
>>
>> I'm also trying to provide something new users can learn easily. The best way to
>> serve _them_ is provide something consistent so they only have to learn a trick
>> once and then it works the same way everywhere.
>>
>> When consistency and historical practice collide, the answer is not always
>> obvious to me.
>>
>> I've often chosen to implement only some of the posix spec, because portions of
>> posix are deeply obsolete. It specifies the "sccs" source control system, batch
>> control commands (qdel and friends), fortran 77, the "ed" line editor, uucp. The
>> roadmap.html page has a section devoted to this.
>>
>> Sometimes you have to explicitly break posix: it says zcat undoes "compress"
>> format (adaptive Lempel-Ziv coding, which was patented in 1984 and utterly dead
>> by the time that expired). Everybody else has zcat undo "deflate", the algorithm
>> introduced by pkzip 2.x in 1993. I care about current reality, not what posix
>> says to do. I only care about posix when it documents current reality.
>>
>> In dd posix specifies ebcdic to ascii conversions, the "swap" byte swapping
>> option (assuming only 16 bit systems have endianness issues), ucase/lcase case
>> mapping that does not _conceptually_ work with multibyte encodings like (let's
>> chop that data into blocks and then do case conversion across block boundaries
>> without ever looking at data from another block...)
>>
>> A real user piped up and said their existing script doesn't work with my tool.
>> That feedback is of interest to me.
>
> and as the person who set us down the strtol path
> (https://github.com/landley/toybox/commit/d5088a059649daf34e729995bb3daa3eb64fa432#diff-ce001a87e82f850a38fd93183e12b417),
> the original request i had was just for hex. like you say, no-one's
> used octal (on purpose) for anything other than mode for decades now.
>
>> Rob
>

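For anyone following along, the POSIX rule Rob quotes above (decimal numbers, an optional k or b suffix, and x as a product separator) can be sketched roughly as below. This is only an illustration in Python, not toybox's code: the name posix_dd_size is made up for the example, and accepting 0 as a factor is an assumption made purely to show why 0x12 comes out as 0 under the product reading.

```python
import re

# Suffix multipliers named in the POSIX dd text quoted above.
SUFFIX = {"": 1, "k": 1024, "b": 512}

def posix_dd_size(expr):
    """Hypothetical helper: evaluate a dd size operand the POSIX way."""
    total = 1
    # "x" separates factors; the result is the product of all of them.
    for factor in expr.split("x"):
        m = re.fullmatch(r"([0-9]+)([kb]?)", factor)
        if m is None:
            raise ValueError("bad dd size operand: %r" % expr)
        total *= int(m.group(1)) * SUFFIX[m.group(2)]
    return total

print(posix_dd_size("2x3k"))  # 2 * (3 * 1024)
print(posix_dd_size("0x12"))  # 0 * 12
```

Under this reading 3x12 is a plain product and 0x12 is zero times twelve, while something like 0xa is rejected because "a" is not a decimal number.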

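The prefix convention Rob points to in strtol (a leading 0x means hexadecimal, a leading 0 means octal, anything else is decimal) can be sketched the same way. Again this is a simplified Python illustration with a made-up name, strtol_base0; unlike the real C strtol it handles only plain non-negative numbers and raises an error on trailing junk instead of stopping at it.

```python
def strtol_base0(text):
    """Hypothetical helper: pick the base from the prefix, as strtol does with base 0."""
    if text[:2].lower() == "0x":
        return int(text[2:], 16)   # 0x... is hexadecimal
    if len(text) > 1 and text[0] == "0":
        return int(text[1:], 8)    # leading 0 is octal
    return int(text, 10)           # everything else is decimal

print(strtol_base0("0x1234"))  # the "printf %d 0x1234" case from the thread
print(strtol_base0("01234"))   # the octal case from the thread
```

With this rule 0x12 is 18 and 01234 is 668, which is where the two readings of a dd operand part ways.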
