[Toybox] N flushing pattern space in gnu/dammit sed

Rob Landley rob at landley.net
Sun Nov 16 12:12:31 PST 2014


I ran the busybox sed test suite against toybox's new sed implementation
(eh, it was there) and busybox has these three tests:

  # non-GNU sed: N does _not_ flush pattern space, therefore c eaten @script end
  # GNU sed: N flushes pattern space, therefore c is printed too @ script end
  testing "sed N (flushes pattern space (GNU behavior))" "sed -e 'N;p'" \
         "a\nb\na\nb\nc\n" "" "a\nb\nc\n"
  testing "sed N test2" "sed ':a;N;s/\n/ /;ta'" "a b c\n" "" "a\nb\nc\n"
  testing "sed N test3" "sed 'N;s/\n/ /'" "a b\nc\n" "" "a\nb\nc\n"

These tests are for a gnu/dammit behavior that doesn't just "extend" the posix
spec but explicitly violates it. According to SUSv4:

  If no next line of input is available, the N command verb shall branch to
  the end of the script and quit without starting a new cycle or copying the
  pattern space to standard output.

It explicitly says _without_ copying the pattern space to stdout, so gnu/dammit
is doing a thing that posix says not to do. The most concise version of it is:

  $ echo -e 'a\nb\nc' | gnu-sed 'N;d'
  c
  $ echo -e 'a\nb\nc' | toybox sed 'N;d'

Both jump to the end of the script, but the gnu version then prints the pattern
space (in the absence of -n), and posix says "don't". I'll happily make
undefined behavior go a certain way, and even argue with things like \n always
being parsed as a newline. (When you can use n as a regex delimiter, escaping
the delimiter character makes it the literal character, so \nba\ngn should be
literal "bang" not "ba\ng", even though the way posix is phrased, \n as newline
always has priority. I think posix got the order wrong, to the point the
gnu/dammit version goes out of its way to make \n behave differently than \t
and such, which is _silly_.)

But "you shall explicitly not do this" seems reasonably clear?
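
To make the difference concrete, here is a rough Python model of the two
behaviors. It is only a sketch, not code from toybox, busybox, or gnu sed: the
names run_sed and gnu_flush are made up for illustration, it assumes no -n
flag, and it understands only the N, p and d commands separated by ";".

```python
def run_sed(script, lines, gnu_flush):
    """Toy model of sed for scripts built only from N, p and d (no -n).

    gnu_flush is a hypothetical switch: True models the behavior the
    busybox tests expect (print the pattern space when N hits end of
    input), False models the SUSv4 wording (quit without printing).
    """
    out = []
    pending = list(lines)
    while pending:
        # Start of a cycle: load the next input line into the pattern space.
        pattern = pending.pop(0)
        deleted = False
        for cmd in script.split(";"):
            if cmd == "N":
                if not pending:
                    # No next line: this is where the two behaviors split.
                    if gnu_flush:
                        out.append(pattern)
                    return "".join(p + "\n" for p in out)
                pattern += "\n" + pending.pop(0)
            elif cmd == "p":
                out.append(pattern)
            elif cmd == "d":
                # d discards the pattern space and skips the autoprint.
                deleted = True
                break
            else:
                raise ValueError("unsupported command: " + cmd)
        if not deleted:
            # End of script: autoprint the pattern space.
            out.append(pattern)
    return "".join(p + "\n" for p in out)


if __name__ == "__main__":
    for flush in (True, False):
        print(flush, repr(run_sed("N;d", ["a", "b", "c"], flush)))
```

With gnu_flush=True the model yields "c\n" for 'N;d' and "a\nb\na\nb\nc\n"
for 'N;p', matching the outputs quoted above; with gnu_flush=False the trailing
"c" is dropped in both cases.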

Rob

(I say "gnu/dammit" because of their "GNU owns Linux, dammit, therefore you
must refer to it as gnu/linux/dammit, because we may have come up with the
idea of cloning unix after the Mark Williams Company and in parallel to BSD,
but we announced our vaporware hurd project _before_ Minix shipped their
complete clone including a compiler, therefore since Linux started life as a
clone of minix on comp.os.minix, we own that idea and they belong to us!!1!one"
campaign. When you aren't specifically talking about Linux, the
"gnu/linux/dammit" campaign collapses to gnu/dammit, as in the gnu/dammit
implementation of sed.)
