[Toybox] [PATCH] netcat: make -l exit after handling a request

Sun Jun 25 12:14:01 PDT 2017

On 06/09/2017 04:22 PM, Josh Gao wrote:
> On Fri, Jun 9, 2017 at 1:31 PM, Rob Landley <rob at landley.net
> <mailto:rob at landley.net>> wrote:
> 
>     On 06/09/2017 01:05 PM, Josh Gao wrote:
>     > Patch attached.
> 
>     What specifically did you fix? (I.E. is this the same "netcat -L
>     zombies" issue I have on my todo list?)
> 
> 
> This makes `nc -l` do what it says it does, and listen for exactly one
> connection before exiting, instead of handling connections until a ^C.

Ok, getting around to looking at this one (sorry for the delay).

I have a longish todo list for netcat, the oldest entry in which is that
commit f492fccc9ceb was sort of awkward, and I'm pretty sure I POSTED
about why it was awkward but the second half of
http://lists.landley.net/pipermail/toybox-landley.net/2015-January/date.html
is of course missing from the archive because Dreamhost is incompetent.
(I think there are now four holes in the archive because of them? I'd
have to check my notes. http://landley.net/dreamhost.txt and
http://landley.net/dreamhost2.txt have plenty more I haven't bothered to
put online because what's the point?)

Anyway, the problem I have with that commit is now this:

  netcat 127.0.0.1 80 -t

doesn't work. It treats -t as a third non-option argument because it hit
a non-option argument first and stopped parsing. But it ONLY does that
if you enable server mode! If you build without server mode, it works fine.)

Originally I had it so that hitting -l or -L would stop the option
parsing (there's an optstr "l^" syntax), but then somebody did (and I
quote from their email):

  netcat -l ls -p 1234

And they were surprised it didn't work. (Sigh. It still doesn't, but now
it doesn't work in the xargs way, which I assume more people understand?
Except xargs doesn't parse any non-option arguments _other_ than the
command line to run, and netcat more or less requires them in dialout
mode...)

Other netcat todo items include:

1) switching it to use xconnect() which it predates, and which is hard
because the various users in tree all want slighty different things out
of the getaddrinfo() plumbing and I've made a couple attempts to
unify/genericize it but keep getting pulled alway by $DAYJOB crisis du
jour halfway through and forgetting what design problem details I was
halfway through solving and have to start over again...

2) adding UDP mode so it can handle netconsole, and then deleting
tcpsvd.c out of pending (which was a third party submission for
functionality I'd always planned to have netcat implement).

3) converting netcat to use xvfork() so the fiddly sequencing in commit
4e867b8a3527 isn't necessary anymore. (See the comment "gcc's insane
optimizer can overwrite this", and the comment about liveness analysis
below that. I posted about this to the list back in October, and also
blogged about it but I fell behind editing my blog and restarted in
january so that entry's not actually uploaded; that gap isn't
dreamhost's fault it's me not having enough hours in the day since my
startup ran out of money last summer and those of us left took on extra
responsibilities to cover the lost headcount. It's on the todo list...)

The vfork() issue may actually be a musl-libc bug because built against
musl vfork() hasn't got attribute returns_twice and thus gcc is
"optimizing" stack usage next time it makes a function call, not
protecting local variable values it thinks it won't be reused, and thus
when the parent resumes it has garbled local variables unless you hit it
with a rock like that commit did. But XVFORK() uses a wrapper function
that _is_ annotated returns_twice and thus it should work reliably
whatever musl's headers do?

(Ok, it doesn't really, the vfork() call is an argument to the
returns_twice function, so vfork is called and the result is passed into
that function call, twice. But that should be close enough? Ok, I can
imagine a pathological "optimizer" moving the vfork() call up to the
start of the outer function, calling it once, and cacheing its value. I
wouldn't put that kind of lunacy past modern gcc. But given I can't
#undef a function prototype to redo it, and can't return from a function
after vfork, this is the closest I can come up with to make it work?
This is half the reason the linux kernel craps memory barriers all over
the place, to block gcc's optimizer from doing stupid reordering of
straightforward C...)

>     terminal1$ echo foo | toybox nc -l -p 1234 > bar
>     terminal1$ cat bar
>     bar

Your call there wouldn't work with the "stop after -l" version either,
the -p would have to come before the -l. :P

>     terminal2$ echo bar | toybox nc localhost 1234 > foo
>     terminal2$ cat foo
>     foo

Um... yes? That's what it's supposed to do, and what it's doing without
this patch?

Your first hunk told it to run an instance of netcat in chat mode (no
command to run, so instead stdin and stdout of netcat get forwarded to
the connection). This will block until the connection is made, although
if you're redirecting the input you can background it if you want to.

The second hunk then made a connection. The "bar" you wrote to the
second instance got written to the first instance's output, and the
"foo" you wrote to the first instance got forwarded to the second
instance's output. When each instance gets EOF from stdin it calls
shutdown(fd) to let the other instance know, so both instances exit.

I just built netcat from clean toybox, and it did that. I don't see the
problem?

> (Unrelated: is matching the BSD netcat's command line interface a goal?

I hadn't planned on it?

I wrote my first netcat in 2001 when there WAS no standard version, just
a source posting from somebody named "Frodo" I think? My old one's
probably still on the net somewhere... yup.

  http://dvpn.sourceforge.net/old/netcat.c

I then ripped busybox's version a new one a few years later because it
just got stuff _wrong_ (for example it didn't use shutdown() on the
connection to pass along EOF info which meant if you did "echo GET / |
netcat cnn.com 80" it wouldn't work, which at the time was still
relevant). I also made server mode work and so on. At the time, linux
distros still didn't install a netcat by default, so the functionality
busybox implemented was new. (My busybox mount was also the first one to
autodetect when to use loop devices so you didn't have to say "-o loop".)

But people expected busybox implementations of commands to be inferior
rather than to ever be _more_ capable than others, so even though I
submitted my "count" program to busybox in 2003
(http://lists.busybox.net/pipermail/busybox/2003-October/043590.html
which got bikeshedded into pipe_progress) which I'd been using since I
wanted to know how long imaging disks too while working at boxxtech in
2000, that didn't a "pv" package from showing up later and being
considered the "real" one that busybox must obviously have been
retroactively copying.)

Sometimes I bow to this stuff. I added "touch -l" to set the length of a
file in 2007 (since there was no obvious way to do it), the FSF added a
"truncate" command in 2009, I noticed and implemented the new command in
2011:

  http://landley.net/notes-2011.html#08-06-2011

But my ubuntu 14.04 command line goes:

  $ netcat --help
  netcat: invalid option -- '-'
  This is nc from the netcat-openbsd package. An alternative nc is
  available in the netcat-traditional package.

And I'm pretty sure I installed that netcat manually? (It says
"Automatically installed: no".) Which implies  there _isn't_ a netcat in
the base os, or wasn't a while back.

Meanwhile, Denys Vlasenko made a different call back in 2009:

  http://lists.busybox.net/pipermail/busybox/2009-July/069920.html

Yes, busybox has two versions of netcat now. Because nothing says small
and simple like having multiple implementations of the same command...

I note that the "netcat 1.10" he added was released January 2, 2007, and
the last commit I did to the busybox version before I left was
d921b2ecc0d2 on August 3 2006, so once again I get dinged for not being
compatible with something that wasn't released until after I wrote mine.

I'm aware there is a "gnu netcat". That is the extent of my knowledge on
that subject, and I'm pretty happy with that decision.

> I keep accidentally doing `toybox nc -l 1234` and getting bitten by it.)

Um, what do you expect that to do?

What it's _trying_ to do is tell you the port number it auto-selected
before backgrounding the -l command ready to run "1234". Since you
didn't specify a -p the network layer will auto-select a port, printing
it lets you do PORT=$(netcat -l blah) and then use that port number in a
script.

It does -p and -s for consistency: you specify the local port and local
address the same way dialing out as in sever mode. And if you don't it
provides defaults (and announces them):

  $ ./netcat -l echo hello
  40351
  $ pgrep netcat
  2298
  $ ./netcat 127.0.0.1 40351
  hello
  $ pgrep netcat
  $

>     I'm -> <- this close to getting a release out, the only issue left after
>     dmesg is fixing the help text generator, lemme do that, cut a release,
>     and _then_ dive into this. :)
> 
> Sure, thanks :-)

I got the release out, and now it's the weekend I'm digging out some
backlog, but I'm still not quite sure what this patch fixes?

Having two pollinate() calls (and thus two alarm(0) calls) is fallout
from that stupid vfork weirdness, part of the XVFORK() work is unifying
that again (and testing it with llvm, current gcc, and older gcc, on
multiple architectures) to make sure my returns_twice theory holds up.
(Supporting nommu and non-nommu with the same code path is fiddly). The
need for that much testing is why I've been getting mkroot ready to be a
testing environment before revisiting this, and also why I'm reluctant
to apply this patch without a better idea what it's supposed to fix...

Hmmm... another issue, if xexec() doesn't actually exec sockfd() gets
inherited by the child. It should explicitly close it the way it's doing
in1...

Lemme clean up my pending issues and see if you still have yours?

> -Josh

Thanks,

Rob