[Toybox] thinking about tests/ls.test

Rob Landley rob at landley.net
Mon May 18 20:21:14 PDT 2015


So I don't know how to do a tests/ls.test.

For example, "ls -1f lib" shows files in filesystem order, not in sorted order.
But filesystem order can be hash table output that gets completely perturbed
with each write to _other_ parts of the filesystem. Even if I mount my own
initramfs and extract files into it in a known order, I can't control what
I get back (vulnerable to kernel version upgrades). Running the result through
"sort" kinda defeats the purpose.

This sort of test is easy to A) eyeball, B) compare against another
implementation. But it's really not easy to _automate_...

So anyway, my last few commits mentioned tests that were broken and needed
fixing by said commits. "ls -R dir" should print "dir:" even when the current
directory is empty (which means "ls -R" with no arguments should print ".:"
because it's equivalent to "ls -R .")
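
Those two cases could be checked with something like the sketch below. It
runs outside the test harness, and the helper name ls_recursive_empty and
the scratch directory are made up for illustration, not taken from
tests/ls.test:

```shell
# Hypothetical helper, not part of tests/ls.test.
# Runs "ls -R" against an empty directory, once by name and once with no
# arguments from inside it, so the two labels can be compared.
ls_recursive_empty() {
  tmp=$(mktemp -d) || return 1
  mkdir "$tmp/dir"
  (cd "$tmp" && ls -R dir)   # should print "dir:"
  (cd "$tmp/dir" && ls -R)   # should print ".:"
  rm -rf "$tmp"
}

ls_recursive_empty
```

Both labels should show up even though there are no entries to list.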

ls -Z was segfaulting, but _not_ when given a single directory argument on
the command line (and thus not when given _no_ arguments, which is equivalent
to the single argument ".") because a single directory argument goes through
a different initialization codepath because you don't print a label: for it.
Anything _else_ (two directories, a file, etc) it would segfault. Oh, also
if given -d or -R (which also negate the "descend into an unlabeled
directory" case).
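
A crash like that is at least easy to smoke test, because a shell reports
death by signal as an exit status above 128. A rough sketch (the helper
names are hypothetical, and on a host whose ls has no -Z this only proves
that the error path doesn't crash either):

```shell
# Hypothetical smoke test: succeeds unless ls was killed by a signal.
# An ordinary ls error exits with a small number; SIGSEGV usually shows
# up as 139.
ls_survives() {
  ls "$@" >/dev/null 2>&1
  [ $? -lt 128 ]
}

ls_z_smoke() {
  tmp=$(mktemp -d) || return 1
  mkdir "$tmp/one" "$tmp/two"
  : > "$tmp/file"
  # The argument shapes described above: two directories, a file, -d, -R.
  for args in "-Z $tmp/one $tmp/two" "-Z $tmp/file" "-Zd $tmp/one" "-ZR $tmp/one"
  do
    # $args is deliberately unquoted so it splits into separate arguments
    # (assumes the mktemp path contains no whitespace).
    if ls_survives $args; then echo "ok: ls $args"; else echo "CRASH: ls $args"; fi
  done
  rm -rf "$tmp"
}

ls_z_smoke
```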

There's also stuff about when -s should or shouldn't print totals, which comes
from a pedantic reading of posix' ls.html:

  If any of the -l, -n, -s, [XSI] [Option Start] -g, or -o [Option End]
  options is specified, each list of files within the directory shall be
  preceded by a status line indicating the number of file system blocks
  occupied by files in the directory in 512-byte units if the -k option is
  not specified, or 1024-byte units if the -k option is specified, rounded
  up to the next integral number of units, if necessary. In the POSIX
  locale, the format shall be:

  "total %u\n", <number of units in the directory>

Which is especially fun because the gnu/dammit ls is defaulting to -k when
it's not specified and posix says not to do that. (Send a bug report to posix,
I think they're nuts, but that said, it's what the standard currently
requires. Anyway, that's why my default -s units are 512.)
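
The unit arithmetic from that paragraph, as a small worked sketch. The
helper name total_line is made up, and it assumes the count it's handed
is already in 512-byte blocks (which is what st_blocks conventionally
holds on Linux):

```shell
# Hypothetical helper: format the status line from a count of 512-byte
# blocks. Pass "k" as the second argument to get the -k behavior.
total_line() {
  blocks=$1
  if [ "${2:-}" = k ]; then
    # Two 512-byte blocks per 1024-byte unit, rounded up.
    blocks=$(( (blocks + 1) / 2 ))
  fi
  printf 'total %u\n' "$blocks"
}

total_line 9      # prints "total 9"
total_line 9 k    # prints "total 5"
```

Nine 512-byte blocks is 4.5 kilobytes, which rounds up to 5 units under -k.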

So the above means that in the toybox source, "ls -s lib/*" shouldn't
have a totals: line and "ls -s lib" should. (Behavior's right, but there's
no test for it in the test suite, so when I go in and fiddle with the
code I have to step _really_ carefully to avoid breaking all these corner
cases that I don't quite remember. This is what a regression test suite
should be for, pointing out the bits you screwed up when you made an
"obvious" change, so you feel safer _making_ obvious changes.)

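A regression check for the totals line might look roughly like this,
using -s in both invocations. The helper names and the scratch "lib"
directory are hypothetical, and it pins LC_ALL=C because the "total"
format is only specified for the POSIX locale:

```shell
# Hypothetical helper: count "total" status lines in the output of ls.
count_totals() {
  LC_ALL=C ls "$@" | grep -c '^total '
}

totals_demo() {
  tmp=$(mktemp -d) || return 1
  mkdir "$tmp/lib"
  echo hello > "$tmp/lib/a.c"
  echo world > "$tmp/lib/b.c"
  # Only regular files in lib, so lib/* names no directories to descend into.
  (cd "$tmp" && echo "ls -s lib/*: $(count_totals -s lib/*)")
  (cd "$tmp" && echo "ls -s lib:   $(count_totals -s lib)")
  rm -rf "$tmp"
}

totals_demo
```

The first count should be 0 (files named on the command line get no
status line) and the second should be 1 (a directory listing does).
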
Unfortunately when I ask for test suites, well meaning people give me
tests like commit ef0ed68d5ba5 which I cut down ala commit d6f8c41e2542
because 99% of what it's testing is the _kernel_ behavior, not toybox's
behavior.

A good rule of thumb for a toybox test is "if this behavior changes, toybox's
code is broken", which is, alas, hard to do right. A lot of tests are "this
behavior could change in ways that don't matter but would cause the test to
fail, and if it _does_ change it's most likely your kernel, libc, or compiler
is broken because this _other_ test already tested the same thing as far as
toybox is concerned and meanwhile you're not testing these possibilities..."

Case in point, from the current tests/ls.test:

testing "ls with wild char" "$IN && ls file*; $OUT" "file1.txt\nfile2.txt\n" \
 "" ""

That wildcard is expanded by the command shell, not by ls. It's not _testing_
ls. (And I just had to wrap it for 80 columns, the file goes over...)
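
To see that ls never gets the pattern, swap in a command that doesn't
look at the filesystem at all. A small sketch (glob_demo and the scratch
directory are made up for illustration):

```shell
# Hypothetical demonstration that the shell, not ls, expands "file*".
glob_demo() {
  tmp=$(mktemp -d) || return 1
  : > "$tmp/file1.txt"
  : > "$tmp/file2.txt"
  # printf only echoes the arguments the shell hands it, yet the output
  # is the same list the test above expects from ls.
  (cd "$tmp" && printf '%s\n' file*)
  rm -rf "$tmp"
}

glob_demo
```

Quoting the pattern (ls 'file*') hands ls the literal string instead,
which is a different test entirely.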

Another fun ls corner case is that I've only implemented LOCALE=C sorting.
The qsort() call is sorting in ascii order, which means it's a case
sensitive sort, so all the CAPITALIZED entries go first. The default
behavior of ubuntu ls is case _insensitive_ sort, so the toybox source
directory goes "lib, LICENSE, main.c, Makefile, README, scripts"...
If you "LC_ALL=C ls" you get the toybox sort order. (Note: I do not have
any LC_ environment variables set, according to strace it's opening
/usr/lib/locale/locale-archive which is a SEVEN MEGABYTE
binary file and I am so not going there. It does that right after reading
all of /proc/filesystems for some reason. No idea why it's doing that.
I note that /usr/lib/locale has one directory in it called "C.UTF-8"
so maybe that's the default locale but strace of ls doesn't seem to have
looked at that and it's not in /etc/locale.alias which is a file I couldn't
BEGIN to explain the existence of...)
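
The C locale ordering is easy to reproduce without ls. This sketch uses
sort as a stand-in for the comparison ls performs (that the two agree in
the C locale is an assumption here), and c_locale_order is a made-up
name:

```shell
# The names quoted above, sorted with byte-order collation.
# Uppercase letters have lower byte values than lowercase ones, so the
# capitalized names come out first.
c_locale_order() {
  printf '%s\n' lib LICENSE main.c Makefile README scripts | LC_ALL=C sort
}

c_locale_order
```

That prints LICENSE, Makefile, README, lib, main.c, scripts, one per line.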

Wonder why I'm not doing a lot of locale stuff? I'm so glad this is a
presentation layer issue that x11 and friends have to get right but I can
mostly go "utf8, what a good idea" and leave it at that until people poke me
and go "no, you have to care about this bit"...

Anyway, running TEST_HOST pretty much means exporting "LC_ALL=C" if you
want to match toybox. I should probably upgrade the infrastructure to do
that...
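
Something along these lines would do it. This is a hypothetical wrapper,
not what the test infrastructure currently does, and the function name
is invented:

```shell
# Hypothetical wrapper: pin the locale whenever the tests are pointed at
# the host commands instead of toybox.
pin_locale_for_host() {
  if [ -n "${TEST_HOST:-}" ]; then
    LC_ALL=C
    export LC_ALL
  fi
}

pin_locale_for_host
```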

I haven't prioritized doing test suite entries because pending and 1.0 and
android and tizen... And then there's the "posix says to do this but do we
_want_ to do this?" ala the -s without -k above. The gnu default violates
the spec, but that bit of the spec hasn't made sense since at least the
1990's. Still, our choosing to follow posix there isn't _wrong_. Similarly I
didn't implement the full xargs whitespace behavior because we have -0 now
and Rich Felker's dinged me on that and it's on the todo list but... it's
a judgement call.

And then there's the fact that I want the tests to (mostly) work on the
host with TEST_HOST, which can produce very different output sometimes.
But a lot of the tests I get are "does this match the gnu output exactly",
which fails when toybox doesn't match which isn't necessarily the same as
toybox's output being wrong...

Possibly what I need to do is add cleanup.html writeups on each submitted
test suite entry, but as far as I know they're not blocking anything. It's
just an easy thing for me to say when people ask how they can help, and
any test is theoretically better than no test. (I.E. I haven't wanted to
complain about this because what it _means_ is I've been giving people
bad advice. Actually what it _really_ means is I need to do a whole lot
more work extending cleanup.html with test suite cleanups so that people
can then send me what I consider to be good tests...)

I really should tackle this now while recent ls changes (and working out
why I did stuff that way in the first place and what my changes broke)
are fresh in my mind, but there's a whole pile of _other_ smack support
patches, and now that I've stopped the ls warnings "id" is doing warnings...

Rob

P.S. I'm <strike>on a horse</strike> in Tokyo again, so replies may get a
bit spotty. In theory I'm doing a bucket of low-level kernel work so
we can do hardware changes, populating nommu.org and 0pf.net (yes, that's
a zero, the orangutan protection foundation got there first) with actual
content about our Cool New Thing, and preparing a presentation for
linuxcon japan on the off chance that our waitlisted talk gets a room
slot assigned. (That would be nice. I should shake the tree and make puppy
eyes about that...) Anyway, point is: I wanna get the tizen stuff in
because it's my fault it's been blocked so long, but I'm likely to be
crazy busy through the 7th on other stuff, and expecting ls to be
the hard part and everything else to just drop in now that we've worked
out what the infrastructure should look like seems... optimistic.

P.P.S. It's 6:30 am local time and I haven't gone to sleep yet, so today
is going to be _fun_. Jetlag! The Breakfast of Champions! While simultaneously
being the teatime of champions!

P.P.P.S. Because I can't make this hotel's internet work (I have a physical
plug, but dhcp drops and renegotiates every 15 seconds) and thus can't post
this until I get to the office, and am thus typing it all into a text file,
so that's why it's not _posted_ at 6:30 am tokyo time.

