[Toybox] toybox - added cmp

Frank Bergmann toybox at tuxad.com
Thu Feb 9 09:00:57 PST 2012


Hi.

On Thu, Feb 09, 2012 at 05:46:50AM -0600, Rob Landley wrote:
> When get_line() reads data, it can't know where the terminator character
> occurs ahead of time.  So either it makes sure not to read past the end
> of the line (by reading one character at a time), or it has to store the
> beginning of the line somewhere until next time we need it.
> 
> The problem is, "next time we need it" is not well-defined.  Think about
> xargs' -E option, specifying the end of file string.  Can xargs then
> pass through the rest of stdin to the program it execs?  Not if it
> already read an arbitrary-sized chunk of it into an internal buffer so
> further reads from the filehandle start who-knows-where.

Can you please explain this in more detail? Are you talking about xargs
itself or about the command(s) it executes?
If xargs uses buffered input, it can keep using its buffer for as long as
the process exists (provided nothing stupid like fclose(stdin) is done).
This reminds me of stdio. Please look at this example:

[fwb@vdr toybox]$ cat example-setvbuf.c
#include <unistd.h>
#include <stdio.h>
#define BUFSIZE 8
#define ERRORSTRING "error on setting buf\n"
int main()
{
  char inbuf[BUFSIZE+1];
  char outbuf[BUFSIZE+1];
  int c;
  if (setvbuf(stdin, inbuf, _IOFBF, BUFSIZE)) goto error;
  if (setvbuf(stdout, outbuf, _IOFBF, BUFSIZE)) goto error;
  for(;;) { c = getc(stdin); if (c==EOF) break; printf("%c\n", c); }
  fflush(stdout);
  _exit(0);
error:
  write(2, ERRORSTRING, sizeof ERRORSTRING - 1);
  _exit(1);
}
[fwb@vdr toybox]$ echo -e "1\n2\n3\n1\n2\n3" | strace -o strace.out ./example-setvbuf >/dev/null
[fwb@vdr toybox]$ cat strace.out
execve("./example-setvbuf", ["./example-setvbuf"], [/* 29 vars */]) = 0
open("/dev/urandom", O_RDONLY)          = 3
read(3, "\354\2451\224D\204\336y\4\323", 10) = 10
close(3)                                = 0
set_thread_area({entry_number:-1 -> 6, base_addr:0xbf97a810, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbf97a784) = -1 EINVAL (Invalid argument)
read(0, "1\n2\n3\n1\n", 8)              = 8
write(1, "1\n\n\n2\n\n\n", 8)           = 8
read(0, "2\n3\n", 8)                    = 4
write(1, "3\n\n\n1\n\n\n", 8)           = 8
read(0, "", 8)                          = 0
write(1, "2\n\n\n3\n\n\n", 8)           = 8
_exit(0)                                = ?

As you can see, both stdin and stdout are fully buffered: every read()
and write() in the trace asks for 8 bytes. Even plain getc() is served
from the buffer. I can't see where this would break xargs (or the command
it calls?), even with -E and "-n 1".

> bytes at a time is 100 times faster than one byte at a time).  But if
> something later wants a filehandle instead of a file pointer, which

Of course the bytes already in the buffer are lost then, unless you use a
wrapper that combines stdio and raw I/O. :-)

> includes all child programs that want to read from the same filehandle...

IMHO I'm missing the point. Can you please explain this in more detail? A
process that spawns a child can also control the child's standard file
descriptors.
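
A minimal sketch of that control, assuming plain fork(), dup2() and
execvp(). The helper name is hypothetical:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with in_fd as its standard input and
 * return its exit status, or -1 if it could not be run or died abnormally. */
int run_with_stdin(int in_fd, char *const argv[])
{
  int status;
  pid_t pid = fork();

  if (pid < 0) return -1;
  if (pid == 0) {
    /* Child: make in_fd become descriptor 0, then replace the process. */
    if (in_fd != 0) {
      if (dup2(in_fd, 0) < 0) _exit(127);
      close(in_fd);
    }
    execvp(argv[0], argv);
    _exit(127);              /* only reached if execvp() failed */
  }
  if (waitpid(pid, &status, 0) != pid) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child inherits the descriptor in whatever state the kernel has it.
Bytes the parent has already pulled into a user-space buffer are not part
of that state.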

> There are ways around this: the xargs example could instead create a
> pipe, feed the rest of its buffer into that, and then pass along further
> data via a select/poll loop a bit like netcat does.  But this requires
> an extra process stick around to pass along the data instead of merely
> doing an exec().

I don't understand this either. Why a pipe? I guess it would be better
if you posted some example code (or pseudo code). :-)

Frank

-- 
EDV Frank Bergmann                           Tel.     05221-9249753
LPIC-3 Linux Professional                    Fax      05221-9249754
Pödinghauser Str. 5                          email    iservice at tuxad.com
32051 Herford                                USt-IdNr DE237314606
