[Toybox] Toybox 0.8.11 command help clarifications

Wed Nov 13 16:26:48 PST 2024

On 11/13/24 09:31, Craig Poile wrote:
> Absolutely share my questions with the mailing list. Does
> that mean the info will show up in searches made by people like me?

http://lists.landley.net/pipermail/toybox-landley.net/2024-November/030600.html

> Luckily for me, I have no space limitations. I try to balance out
> supplying key information in our doc with recognizing that there
> are detailed specs available for those that need the nitty gritty.

The roadmap at https://landley.net/toybox/roadmap.html mentions several
specification sources we use, and at the top of each toys/*/command.c 
file there's generally a link to the most relevant specification, ala 
toys/posix/sed.c saying "See 
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html".

Although usually the relevant man7.org spec is ALSO going to have useful 
data (if present), especially any linux extensions that aren't in posix. 
You can dig down from https://man7.org/linux/man-pages/index.html to 
find them although manual "sections" complicate this a little bit 
(historical unixism from the 1970s when these were printed manuals and
they just shipped the typesetting "troff" data with the OS and made a
little tool to show it on screen once we switched from teletypes putting
ink on paper to screen-based serial terminals like the DEC VT100 and IBM
3270. Unix is OLD. Anyway, toybox commands that live in /bin would
usually be in section 1 and ones that live in /sbin would be in section
8, and the other sections aren't SUPPOSED to be command line utilities
but things like system calls (section 2), libc functions (section 3),
device files (section 4), and file formats (section 5), see the "intro"
links for an overview of each section...)
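
To make the section numbering concrete, here is a small sketch that builds
a man7.org link from a page name and section number. The helper name
man7_url is made up for illustration, and the manN/NAME.N.html layout is
an assumption based on how common pages such as ls(1) are laid out, so it
is not guaranteed for every page.

```shell
# Build a man7.org URL for a page name and a numeric manual section.
# Assumption: pages live at .../manN/NAME.N.html (not verified per page).
man7_url() {
    name=$1
    section=$2
    printf 'https://man7.org/linux/man-pages/man%s/%s.%s.html\n' \
        "$section" "$name" "$section"
}

man7_url ls 1      # user command, section 1
man7_url blkid 8   # admin command, section 8
man7_url open 2    # system call, section 2
```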

If it's in the posix spec, the default assumption is we're going to 
implement it and need some excuse not to (often documented at the start 
of the command.c file as "deviations from posix" but we haven't been 
scrupulous about that, and should go back as part of the cleanup for 
releasing an eventual 1.0 version). If it's in the man page but NOT the 
posix spec we need an excuse TO implement it (usually that there are 
existing users out there who poked us, and we went "ok, it's not that 
big/complex" and added it.)

The file toys/examples/skeleton.c has a header section with example 
links to each standard type that comes up a lot. That file (and hello.c 
in the same directory) are the "big" and "small" starting points for new 
commands, so writing a new command usually starts with copying that file 
to a new location, renaming each instance of "skeleton" or "hello" to 
the new command name, doing a "make defconfig; make newcommand" to test 
build it and see we didn't miss any (where newcommand is the name of the 
new command), and then hitting the code in the new .c file with a rock 
until it's command-shaped.
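
The copy-and-rename step described above could be sketched roughly like
this. The function name new_toy and the default toys/pending destination
are assumptions made for the example, as is the guess that the uppercase
spelling of the name also needs renaming (for config symbols). It only
does the copy and rename, not the build.

```shell
# Hypothetical helper: copy a template command and rename it.
# Run from the top of a toybox source tree.
#   $1 = template name ("hello" or "skeleton")
#   $2 = new command name
#   $3 = destination directory (toys/pending is only an assumed default)
new_toy() {
    template=$1
    new=$2
    dest=${3:-toys/pending}
    upper_template=$(printf '%s' "$template" | tr 'a-z' 'A-Z')
    upper_new=$(printf '%s' "$new" | tr 'a-z' 'A-Z')

    # Rename both the lowercase and the uppercase spelling of the name.
    sed -e "s/$template/$new/g" -e "s/$upper_template/$upper_new/g" \
        "toys/examples/$template.c" > "$dest/$new.c"

    # Any leftover mention of the template name means a rename was missed.
    if grep -n -i "$template" "$dest/$new.c"; then
        echo "leftover references to $template" >&2
        return 1
    fi
}
```

Usage would then be something like "new_toy hello frob && make defconfig
&& make frob", where frob stands in for the new command name.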

> For du,      -d N       Only depth <N
> 
> I think the Linux man page description was good for my purposes, I
> boiled it down a bit to create a description.

Feel free to completely rewrite them. I would link people directly to
the posix and man7.org pages, it's just... that's not necessarily what 
toybox actually IMPLEMENTS. It was our frame of reference to diverge from.

My goal was to have one 80x25 screen of text (where possible, yes IBM 
punch cards and the resulting TTY screens are archaic but what's the 
next bigger "standard" size other than "infinite"?) that explained what 
this command does and how to use it.

Unfortunately there's an 80/20 tradeoff (the Pareto principle this time,
not screen size) between "being small" and "explaining thoroughly" where
the local peak optimizing for BOTH means each side misses out on around
20%. And it's not EFFORT so you can't double the effort and go from 80%
to 96% (get 80% of what's _left_), it's SIZE being traded off so
doubling the SIZE would presumably get you 96% of the material explained
instead of 80%.
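
The arithmetic behind those percentages, as a quick sketch: if each extra
unit of size explains 80% of whatever is still unexplained, then N units
cover 1 - 0.2^N of the material. The function name coverage is just for
illustration, and the model itself is the back-of-envelope assumption
from the paragraph above, not a measurement.

```shell
# coverage N: percent explained with N times the size, assuming each
# extra unit of size explains 80% of what is still left: 1 - 0.2^N.
coverage() {
    awk -v n="$1" 'BEGIN { printf "%.1f%%\n", (1 - 0.2 ^ n) * 100 }'
}

coverage 1   # 80.0%
coverage 2   # 96.0%
coverage 3   # 99.2%
```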

$ man ls | wc -l
248
$ /bin/ls --help | wc -l
137
$ toybox ls --help | wc -l
35

$ man blkid | wc -l
300
$ /sbin/blkid --help | wc -l
46
$ toybox --help blkid | wc -l
10

If I stay true to the local peak, my stuff _can't_ explain as much as
the coreutils or util-linux versions. It's a problem.

> I played around with wordings for your short forms, if it's of
> any use (in your case, "show" is implied):
>
> Totals for directories N or fewer levels below the command line
> 
> OR
> 
> Directories N or fewer levels below the command line (as totals)

The problem is I already know the answer, so it's hard for me to say 
which is more informative to someone who doesn't. :)

You identified three potholes in the documentation, which I should fill. 
You get to use three times the space and cover 99.2% of the material if 
you want. (Assuming the math holds, which is always dubious.)

> For find, -newerXY  FILE   X=acm time > FILE's Y=acm time (Y=t: FILE is literal time)
>
> I'm going to provide all the possible values for X and Y and a bit of discussion.

Oh sure.

I'm assuming that anybody using "find" on a posix filesystem will have 
encountered the concept of atime, ctime, and mtime somewhere along the 
way, so didn't explain them there. Those aren't "find" concepts, those 
are "filesystem" concepts.
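
For anyone who hasn't met -newerXY before, here is a small sketch of the
two cases from the help text, using modification times set by hand in a
scratch directory so the result doesn't depend on when it is run. The
file names and dates are invented for the example, and it assumes a find
and touch that accept these date strings (GNU does).

```shell
# Scratch directory with explicitly set modification times.
demo=$(mktemp -d)
touch -d '2024-01-01 00:00:00' "$demo/old"
touch -d '2024-06-01 00:00:00' "$demo/ref"
touch -d '2024-11-01 00:00:00' "$demo/new"

# X=m, Y=m: files whose mtime is later than ref's mtime.
newer_than_ref=$(cd "$demo" && find . -type f -newermm ref)
echo "$newer_than_ref"     # ./new

# X=m, Y=t: the argument is a literal time, not a file name.
newer_than_date=$(cd "$demo" && find . -type f -newermt '2024-03-01' | sort)
echo "$newer_than_date"    # ./new then ./ref

rm -r "$demo"
```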

There's a global "toybox --help" page that provides some general assumed 
knowledge I didn't want to repeat in each command, but there's the 
problem of people not knowing to look there. And it doesn't specify 
atime, mtime, and ctime because that's really a linux kernel thing. I 
don't explain that unix time is seconds since midnight January 1st 1970
in UTC either. (Or explain UTC vs GMT, which is a fraction of a second 
anyway; UTC is atomic clock, GMT is astronomical observations and the 
earth WOBBLES.)
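
As a sketch of that assumed background, this prints a file's three
timestamps as raw unix time. It assumes a stat that understands -c with
%X, %Y and %Z and a touch that accepts "-d @SECONDS" (GNU coreutils and
toybox both do, as far as I know). The helper name show_times is made up.

```shell
# Print a file's three timestamps as seconds since 1970-01-01 00:00 UTC.
show_times() {
    stat -c 'atime=%X mtime=%Y ctime=%Z' "$1"
}

f=$(mktemp)
touch -m -d @86400 "$f"    # set only mtime, to one day after the epoch
times=$(show_times "$f")
echo "$times"              # mtime=86400; atime and ctime stay recent
rm "$f"
```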

Figuring out where to start explaining is a bit of a problem for me. 
Computer history is a hobby of mine 
(https://www.landley.net/history/mirror/ is from when I thought I'd have 
the spoons to write a book) and left to my own devices I either go 
https://youtu.be/ECRaBC-bTNQ or "the word Electronics came out of World 
War II"... and try hard not to look down at Charles Babbage and the 
mechanical calculators of the 1800s (including the old timey mechanical 
cash registers that went "no sale").

(The hard part of doing a lot of 
https://landley.net/toybox/video/intro-prebuilt-binary.mp4 is figuring 
out what NOT to say. Stream of consciousness is easy, Pascal's Apology 
ala 
https://www.npr.org/sections/13.7/2014/02/03/270680304/this-could-have-been-shorter 
is where the REAL work lies...)

> It will make the Y=t very clear, and the way you worded it, I thought
> it was always acm time (remember I'm not a Linux user or developer)

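Access, change, modification. (The "c" is inode change time, bumped by
things like chmod and rename as well as writes, though it's often read
as "creation".)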

A lot of modern filesystems don't actually record creation time (just 
when the file was last modified, since if it's writeable you COULD 
always rewrite the entire file contents so what does creation _prove_ 
anyway).

And access time updates were expensive (doing a recursive grep on a 
directory creates a bunch of WRITE traffic to the disk updating the 
atime in all the inodes, especially not something you want to do on 
flash filesystems with a limited number of write cycles), and tend to be 
switched off using the "noatime" mount option these days (and even when 
they're on it does "smart updates" where it'll only update the value if
the change is big enough, usually 24 hours).
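
To check which behavior a given mount has, something like the following
sketch can classify the options field from /proc/mounts. The "smart
updates" mode is listed there as "relatime". The function name
atime_policy is invented, and treating the absence of both flags as
"strictatime or unspecified" is a simplifying assumption.

```shell
# Classify atime behaviour from a comma-separated mount option string,
# such as the fourth field of a /proc/mounts line.
atime_policy() {
    case ",$1," in
        *,noatime,*)  echo noatime ;;
        *,relatime,*) echo relatime ;;
        *)            echo "strictatime or unspecified" ;;
    esac
}

atime_policy "rw,noatime,discard"    # noatime

# On Linux, report the policy for every mounted filesystem.
if [ -r /proc/mounts ]; then
    while read -r dev dir type opts rest; do
        printf '%s\t%s\n' "$dir" "$(atime_policy "$opts")"
    done < /proc/mounts
fi
```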

Note that atime is useful because if you "touch startfile" in a source 
directory and then run a build in there, and then "find . -newerac 
startfile" it will show you all the files that got opened and read by 
the build. (I.E. what source in here was actually USED.) But only if 
it's actually updating the atime. (You can use something like "find .
-type f -print0 | xargs -0 touch -a -d @0" to set the times far enough in
the past that "smart atime" will update them, but have to know that you
NEED to do that...)
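
Here is a sketch of the idea that runs the same way regardless of mount
options. Instead of a real build it sets access times by hand with
"touch -a", and it uses -neweram (access time versus startfile's
modification time) rather than -newerac, because ctime can't be set
explicitly. File names and dates are invented for the example.

```shell
# Scratch tree: a marker file plus two "source" files.
src=$(mktemp -d)
touch -d '2024-01-02 00:00:00' "$src/startfile"
touch "$src/used.c" "$src/unused.c"

# Stand-in for the build: rather than relying on the kernel to update
# atime on read, set the access times by hand.
touch -a -d '2024-01-03 00:00:00' "$src/used.c"     # "read" after startfile
touch -a -d '2024-01-01 00:00:00' "$src/unused.c"   # last read before it

# X=a, Y=m: access time later than startfile's modification time.
used=$(cd "$src" && find . -type f -neweram startfile)
echo "$used"    # ./used.c

rm -r "$src"
```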

Alas, it's complicated. People implement, people optimize, people come 
up with workarounds to defeat the optimizations, the FSF goes out of its 
way to break the workarounds because That's Not How We Wanted You To Be 
Free: Stop Being Free Wrong, people defeat the breakage of the 
workaround to undo the optimization...

The eternal struggle.

> For your short version, I played around again:
> 
> ... (Y=t: FILE is time string)
> 
> For ls, sort by:  (also --sort=longname,longname... ends with alphabetical)
> 
> I'm at a loss for suggestions but I am going to provide a table of
> longname values in a table and finish with a statement like, Any
> conflicts that remain after the specified sorting are resolved by
> sorting alphabetically.

I found it hard to phrase succinctly. :)
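
The tie-breaking rule itself is easy to show with sort(1) on made-up
"size name" records. This only illustrates the idea of a primary key with
an alphabetical fallback, it is not how toybox ls is implemented.

```shell
# Order "size name" records by size, then break ties alphabetically.
sorted=$(printf '%s\n' '20 b.txt' '10 c.txt' '20 a.txt' |
    LC_ALL=C sort -k1,1n -k2,2)
echo "$sorted"
# 10 c.txt
# 20 a.txt
# 20 b.txt
```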

Rob