[Toybox] [PATCH] tail fixes

Felix Janda felix.janda at posteo.de
Fri Sep 6 14:48:10 PDT 2013


Rob Landley wrote:
> On 08/31/2013 05:53:09 AM, Felix Janda wrote:
> > This should now hopefully fix the earlier segfaults.
> > 
> > In the case that lseek doesn't work and we count from the end
> > the algorithms for tail -n and tail -c are now more separate.
> > (tail -c is now more efficient since it adds lenghts instead of
> > counting bytes.)
> 
> And made it back here with my email client rather than snooping in  
> gmail's web interface. Time for a closer look...
> 
> > POSIX tail BTW does only accept one input file. tail is not specified
> > by LSB. I guess it is convenient when combined with "-f" to watch
> > multiple files.
> 
> I treat posix as a frame of reference to diverge from. The gnu/dammit  
> guys take credit for every third party contribution from 20 years of  
> Linux development (and some disgruntled solaris guys back in the 80's  
> when Ed Zander decided to unbundle the compiler from the OS and charge  
> extra for it, and they all installed gcc instead).
> 
> I care about "what's on my existing linux box", both the actual current  
> Ubuntu LTS, and the old 2008 version I have in a qemu image, and the  
> various gentoo/fedora/suse/decade old red hat 9" test images I have  
> lying around. Some of it's crazy and it's better not to do that, but  
> it's a point of reference. (So is Android, and busybox, and the linux  
> standard base...)
> 
> What The Hurd implements is not. Those religious fanatics are simply  
> not relevant, except sometimes as a warning of what _not_ to do.

I just wanted to mention this. It means that we have some more freedom
in the implementation.

> > For tail -f, how about putting the fds into an array and then using
> > select() to watch the fds?
> > 
> > GNU and busybox print the header again when the file last read from
> > changes. They behave differently when a file occurs on the command
> > line more than once.
> > 
> > # HG changeset patch
> > # User Felix Janda <felix.janda at posteo.de>
> > # Date 1377945041 -7200
> > # Node ID 3cd61d16aabec82505da96a142663cedbba8d12a
> > # Parent  9a059e9b3a17a372fc8b225f512af57c72f4eeaa
> > tail: Some fixes
> > 
> > - Rewrite most of the not lseek() logic
> > - Change meaning of len in line_list
> > - Use single instead of double linked list
> > 
> > diff -r 9a059e9b3a17 -r 3cd61d16aabe toys/posix/tail.c
> > --- a/toys/posix/tail.c	Fri Aug 30 20:09:10 2013 +0200
> > +++ b/toys/posix/tail.c	Sat Aug 31 12:30:41 2013 +0200
> > @@ -38,19 +38,20 @@
> >  )
> > 
> >  struct line_list {
> > -  struct line_list *next, *prev;
> > +  struct line_list *next;
> 
> Double lists are easy to create in order, I usually do that when  
> sequencing is important.

I was not aware of this. prev just seemed like another thing to
keep track of. do_tail previously unecessarily kept track of it
when drop no longer necessary chunks.

> >    char *data;
> > -  int len;
> > +  size_t len;
> >  };
> 
> You changed this from a signed to an unsigned value why? The largest  
> value fed into get_chunk is sizeof(toybuf), which easily fits in an  
> int. (If we're allocating a larger than 2 gigabyte block of data,  
> something is wrong.)

I just feel more comfortable with modular arithmetic than possibly
undefined behavior. In this case it doesn't really matter.

> > -static struct line_list *get_chunk(int fd, int len)
> > +static struct line_list *get_chunk(int fd, size_t len)
> >  {
> >    struct line_list *line = xmalloc(sizeof(struct line_list)+len);
> > 
> > -  line->data = ((char *)line) + sizeof(struct line_list);
> > +  line->data = (char*)line + sizeof(struct line_list);
> >    line->len = readall(fd, line->data, len);
> > +  line->next = 0;
> 
> Hmmm... the caller was supposed to set this. I see the caller doesn't  
> do so reliably for the end of the list and that is a bug. Quite  
> possibly the rest of this patch is noise?

Some large part. Sorry.

> > -  if (line->len < 1) {
> > +  if (line->len + 1 < 2) {
> 
> It was signed for a reason; readall returns -1 on error, and depending  
> on wraparound like that is nonobvious and awkward.

Unless you want to do a signed comparison there is no problem of storing
-1 in an unsigned variable. -1 makes sense in any ring.

But just revert this, if it makes you feel uncomfortable.

> >      free(line);
> >      return 0;
> >    }
> > @@ -61,7 +62,9 @@
> >  static void dump_chunk(void *ptr)
> >  {
> >    struct line_list *list = ptr;
> > -  xwrite(1, list->data, list->len);
> > +  size_t len = list->len - (list->data - (char*)list - sizeof(struct  
> > line_list));
> > +
> > +  xwrite(1, list->data, len);
> >    free(list);
> >  }
> 
> Actually the point of dump_chunk() was to output whole chunks. The  
> previous code should have either written up to the end of the previous  
> chunk or adjusted the start and length of the chunk for us.
> 
> (It's a llist_traverse() callback function to output all pending  
> buffered data.)

That was just a horrible attemp to shave off a line. See my other mail
on how to fix it.

> Sigh, it is _annoying_ to have two codepaths...

They are quite different after all. There even two code paths in
do_tail() and my patch introduces another one...

> > @@ -71,11 +74,11 @@
> >  static int try_lseek(int fd, long bytes, long lines)
> >  {
> >    struct line_list *list = 0, *temp;
> > -  int flag = 0, chunk = sizeof(toybuf);
> > -  ssize_t pos = lseek(fd, 0, SEEK_END);
> > +  int flag = 0;
> > +  size_t chunk = sizeof(toybuf), pos = lseek(fd, 0, SEEK_END);
> 
> pos was a ssize_t because lseek() can return failure, now you've made  
> it a size_t which can't be negative. (Do you have something against  
> signed numbers?)

It can't be negative. But it can be -1. But actually we should use
an off_t for pos. One might be curious about the end of a 8GB file
in a 32bit system.

> >    // If lseek() doesn't work on this stream, return now.
> > -  if (pos<0) return 0;
> > +  if (pos == -1) return 0;
> 
> And yet you compare it with a negative. Maybe the bit patterns work  
> out, but I'm uncomfortable with that. I can see a compiler optimizing  
> this out simply because it can never be true. (That's actually what  
> happens when pos is unsigned char, due to type promotion.)

You are right. It should be -1u. (This can only be optimized out if
sizeof(size_t) < sizeof(int) (is that possible?). Otherwise there are
two cases: If pos can be represented as an int, it will be converted
to an int and we have a signed comparison. Otherwise -1 is promoted
to size_t and an unsigned comparison happens. The implementation of
signed numbers is irrelevant.)

> >    // Seek to the right spot, output data from there.
> >    if (bytes) {
> > @@ -88,40 +91,36 @@
> > 
> >    bytes = pos;
> >    while (lines && pos) {
> > -    int offset;
> > +    size_t offset;
> 
> This offset is into the current chunk. Chunks are page size, and offset  
> is initialized to list->len which is at most sizeof(toybuf). So int is  
> plenty big for this.

Yes.

> >      // Read in next chunk from end of file
> > -    if (chunk>pos) chunk = pos;
> > +    if (chunk > pos) chunk = pos;
> >      pos -= chunk;
> >      if (pos != lseek(fd, pos, SEEK_SET)) {
> >        perror_msg("seek failed");
> >        break;
> >      }
> >      if (!(temp = get_chunk(fd, chunk))) break;
> > -    if (list) list->next = temp;
> > +    if (list) temp->next = list;
> >      list = temp;
> 
> Actually you could unconditionally temp->next = list and that takes  
> care of null terminating the list if temp started null.

Yes. (currently your if clause is false.)

> And, yes, I believe that fix is needed. :) (This instance is creating a  
> list in reverse order because we're reading the file in reverse order,  
> thus it doesn't need a doubly linked list.)

Right.

> >      // Count newlines in this chunk.
> >      offset = list->len;
> >      while (offset--) {
> >        // If the last line ends with a newline, that one doesn't  
> > count.
> > -      if (!flag) {
> > -        flag++;
> > -
> > -        continue;
> > -      }
> > +      if (!flag) flag++;
> >        // Start outputting data right after newline
> > -      if (list->data[offset] == '\n' && !++lines) {
> > +      else if (list->data[offset] == '\n' && !++lines) {
> 
> Eh, ok.
> 
> >          offset++;
> >          list->data += offset;
> > -        list->len -= offset;
> 
> offset was initialized to list->len which can't be zero or get_chunk()  
> would have discarded it. Therefore, offset should never be > list->len  
> so this can't reduce it to a negative number.
>
> Yet you decided to not adjust the length here, and instead complicate  
> dump_chunk with knowledge of the structure layout. Why?

I gave a (very lame) explaination above.

> > -        break;
> > +        goto done;
> >        }
> 
> Which breaks us into a while(lines) loop, and we only got in here if  
> !++lines so lines has to be 0 _after_ the increment, so the outer loop  
> must also exit, so the goto is unnecessary.

Sorry, that's a remnant of a previous version. (I wasn't sure whether the
code was right in the case that the last line doesn't end with a newline.)

> >      }
> >    }
> > 
> > +done:
> >    // Output stored data
> >    llist_traverse(list, dump_chunk);
> > 
> > @@ -143,58 +142,54 @@
> >    // Are we measuring from the end of the file?
> 
> I thought the non-lseek tail case was already working? (Are there bugs  
> in both?)

In some cases... On the contrary, I had produced a reproducible segfault
in the non-lseek logic and unexpectedly found a bug in the lseek logic.

> The diffstat on this last hunk is:
> 
>   onen--- |   64  
> ++++++++++++++++++++++++++++++----------------------------------
>   1 file changed, 30 insertions(+), 34 deletions(-)
> 
> So a net savings of 4 lines, while adding a new intermediate structure.  
> Previously we had one data structure (a linked list of chunks), now  
> there's a second structure wrapping that other structure, an array  
> containing pointers to nodes of a linked list (that's still a linked  
> list).
> 
> Not sure that's actually simpler. Or faster. Or uses less memory.
>
> Sigh. Large rewrite of how something works, including a lot of  
> unrelated changes, is not something I'm happy to see presented as a  
> bugfix. Especially when I'm trying to dig an answer to "so what was the  
> actual _bug_?" out of it, so I can confirm whether or not it was  
> actually fixed.

The previous code had the assumption that every chunk is terminated by
a newline (which is utterly wrong). This is the simplest implementation
I could come up. I'll try to explain further things below.

After spending quite some time with this reply I see the simple
(less efficient (in terms of number of instructions)) algorithm.

> I am exerting a lot of pondering answering what should be a simple  
> question.
> 
> >    if (bytes<0 || lines<0) {
> > -    struct line_list *list = 0, *new;
> > +    struct line_list *head = 0, *tail, *new;
> > +    // circular buffer of lines
> > +    struct {
> > +      char *start;
> > +      struct line_list *inchunk;
> > +    } *l = xzalloc(2*-lines*sizeof(void*));
> > +    int i = 0, flag = 0;
> > +    size_t count, len = bytes;
> 
> malloc(0) is allowed to return NULL, which means xmalloc() will abort,  
> so in some libc implementations -c just stopped working.

That's bad. Could one declare it as an array on the stack instead?

> >      // The slow codepath is always needed, and can handle all input,
> >      // so make lseek support optional.
> > -    if (CFG_TAIL_SEEK && try_lseek(fd, bytes, lines));
> > +    if (CFG_TAIL_SEEK && try_lseek(fd, bytes, lines)) return;
> >      // Read data until we run out, keep a trailing buffer
> > -    else for (;;) {
> > -      int len, count;
> > +    for (;;) {
> 
> Eh, cosmetic, but ok.
> 
> >        char *try;
> > 
> >        if (!(new = get_chunk(fd, sizeof(toybuf)))) break;
> >        // append in order
> > -      dlist_add_nomalloc((void *)&list, (struct double_list *)new);
> > +      if (head) tail->next = new;
> > +      else head = new;
> > +      tail = new;
> 
> Open code the function, adding more local variables? I can sort of see  
> it, but I tend to lean on the dlist functions because (as you can see  
> from above), manual linked list code is easy to screw up and hard to  
> read. (It's 8 extra bytes per 4k chunk to use common code, and you were  
> comfortable adding an entire array of structures here...)

I just thought simple linked lists were easier to keep track of. I see
your point. (Again more about the structures below.)

> > -      // Measure new chunk, discarding extra data from buffer
> > -      len = new->len;
> >        try = new->data;
> > -      for (count=0; count<len; count++) {
> > -        if ((toys.optflags & FLAG_c) && bytes) {
> > -          bytes++;
> > -          continue;
> > +      if (lines) for (count = 0; count < new->len; count++, try++) {
> > +        if (flag) { // last char was a newline
> > +            while (l[i].inchunk && (l[i].inchunk!=head))  
> > free(llist_pop(&head));
> > +            l[i].inchunk = tail;
> > +            l[i].start = try;
> > +            i = (i + 1) % -lines;
> > +            flag = 0;
> >          }
> > -
> > -        if (lines) {
> > -          if(try[count] != '\n' && count != len-1) continue;
> > -          if (lines<0) {
> > -            if (!++lines) ++lines;
> > -            continue;
> > -          }
> > +        if (*try == '\n') flag = 1;
> > +      } else { // bytes
> 
> Actually we don't have an exclusion in the option string for [!nc] and  
> I vaguely recall I was trying to make it work if you specified both at  
> the same time. (So it showed you that many characters from the end  
> _plus_ that many lines before that point.)

I didn't know that. Could you specify it in more detail (the order matters)?

> That's why -c ate its argument (counting up), and then -n ate its  
> argument (counting up) in sequence. However, other implementations  
> instead make "tail -c 7 -n 3" act like just -n 3, and "tail -n 3 -c 7"  
> act like -c 7. And the way to do _that_ is a [-cn] exclusion on the  
> string, then check the flag.
> 
> (Hmmm... ideally I'd have lib/args.c zero the save slot when it unsets  
> the flag, then we don't have to check the flags... Ok, I think I can  
> make that work without too much extra code.)
>
> > +        if (len + new->len < len) flag = 1; // overflow -> have now  
> > read enough
> > +        for (len += new->len; flag && (len - head->len < len);) {
> > +          len -= head->len;
> > +          free(llist_pop(&head));
> >          }
> > -
> > -        // Time to discard data; given that bytes and lines were
> > -        // nonzero coming in, we can't discard too much if we're
> > -        // measuring right.
> > -        do {
> > -          char c = *(list->data++);
> > -          if (!(--list->len)) {
> > -            struct line_list *next = list->next;
> > -            list->prev->next = next;
> > -            list->next->prev = list->prev;
> > -            free(list);
> > -            list = next;
> > -          }
> > -          if (c == '\n') break;
> > -        } while (lines);
> >        }
> >      }
> > +    if (lines) head->data = l[i].start;
> > +    else head->data += len;
> > 
> >      // Output/free the buffer.
> > -    llist_traverse(list, dump_chunk);
> > +    llist_traverse(head, dump_chunk);
> > 
> > +    free(l);
> 
> Sigh. The previous design was _just_ doing linked list traversal, and  
> it was pretty much the same sort of linked list traversal in both  
> codepaths.
> 
> This change has made the two codepaths have fundamentally different  
> designs, taking them further away from being unified. (Not that they  
> necessarily can be unified, but the more different the two  
> implementations are the harder the command as a whole is to understand.  
> This makes me uncomfortable.)

The lines variant just needs to keep track of much more data so I didn't
want to burden the bytes method with that. (And the count++ seemed quite
inefficient for the bytes.)

> >    // Measuring from the beginning of the file.
> >    } else for (;;) {
> >      int len, offset = 0;
> 
> I thought the existing design was reasonably straightforward:
> 
>    Loop reading chunks:
>      1) Count up requested bytes until full
>      2) Count up requested lines until full
>      3) Once we have enough data but not the _right_ data,
>         discard from head as we add to tail.
>    Print result.

Now it makes some sense to me. The "Time to discard..." section is
however wrong. If we have enough data, so that lines = 0, the loop
is only run once. It seems to me that in this case for each new byte
we add to the data at the end we drop one in the beginning. How should
this in the end when we reach the end of the file tell us the right
point to output the lines?

> The implementation might not have been, but we shouldn't need more  
> structures to track this. The old approach even had reasonably good  
> cache locality. I can see objecting to incrementing bytes++ each time  
> for large -c values (didn't want to special case it and L1 cache local  
> spinning on a register to count up to even a few million should be way  
> less than the read() overhead, but it still wastes battery...)

The extra structure keeps track of the lines found so far. It remembers
the start of each in line and because of the existence of segmented
memory it also has to remember which chunk the line starts in. Using
the structure there is no need to reread the list to find out where the
lines were located.

> Let's see, assuming the actual bugs were the uninitialized next pointer  
> in the last list entry and the list entries being added in the wrong  
> order (so only the last list entry every showed), so if I separate the  
> count and line codepaths into an if/else but keep the head/tail  
> traversing design without this sideband line array...

There were more. (Just try the previous lseek logic with something big
containing no newlines.)

>   toys/*/tail.c | tail -n 87) | diffstat
>   one |   66  
> +++++++++++++++++++++++++++++++-----------------------------------
>   1 file changed, 31 insertions(+), 35 deletions(-)
> 
> Heh, same number of lines. And I added comments...
> 
> Now this is cheating slightly because I added a dlist_pop() function  
> that adjusts the prev and next pointers to lib/llist.c. (The two  
> different code paths each discard chunks, so a function is better than  
> inline.) I could instead do the two pointer thing you're doing to  
> create a singly linked list inline, but I'd want to make a generic  
> function to do that and add it to lib/llist.c instead of open coding it  
> here, and last I tried it that required having a magic list head  
> structure which I'd rather not do...

Do you really understand the "Time to discard data;..." logic? The
comment was not really helpful for me.

> (Sorry, I've been vaguely meaning to come back to tail and see if I  
> could clean this up and unify the codepaths. The backwards traversal vs  
> forwards traversal thing is what doubly linked lists are _for_. But  
> every time I sit down to bang on toybox I've got five more messages  
> from people submitting more code for me to look at, and I never get to  
> any of my own todo list because I'm perpetually behind...)
> 
> Grrr, all right, let's confront the design issue you've raised: am I  
> ever using the doubly linked list for anything _except_ in-order singly  
> linked list creation? Let's see, the current users are ls, xargs, tail,  
> patch, and sed.

Just to sketch what I was thinking: I noticed that no doubly-linked list
was strictly necessary in the code. Then I found that there is no
llist_add_nomalloc. Finally I concluded that it must be so trivial that
it should be done inline...

> ls: is abusing a dirtree * to do in-order creation via  
> dlist_add_nomalloc() (with ->parent standing in for ->prev) and then  
> cleaning it up with dlist_to_dirtree() function that's only ever called  
> from one place... that could use some cleanup. Anyway, it's never using  
> prev, just writing to it or breaking the circular list with it.
> 
> patch: in order creation, using prev to terminate teh circular list to  
> get a singly linked list, and at one point removing an item while  
> maintaining the circular list (so adjusting the pointers, but that's  
> write not read).
> 
> xargs: in order creation, never using prev except to break the chain.
> 
> sed: unifinished. I'd love to get time to work on it but tonight I'm  
> doing this instead.
> 
> tail: case in point.
> 
> So it _looks_ like the dlist stuff can go, and be replaced by a  
> function that does in order creation of singly linked lists. (If I  
> really need something that can be traversed both ways, so far it's  
> either an array or a tree.)
> 
> Right. What does said function look like? Because:
> 
>    llist_add_tail(void *list, void *end, void *new);
> 
> Is... kinda sad. It only writes to list if it was null (initializing an  
> empty list), should that be the caller's job? Can it _return_ list?
> 
>    list = end = 0;
> 
>    while (new = get()) list = llist_add_tail(&end, new);
> 
> No, then it doesn't know what list was after the first call, I've got  
> to pass it in so it can check if it's not NULL or have the caller do  
> that. The inline version is something like:
> 
>    struct listy *list = 0, *end = 0, *new;
> 
>    while (new = get()) {
>      if (!list) list = new;
>      else end->next = new;
>      new->next = 0;
>      end = new;
>    }
> 
> And I wrap that in a function how?
> 
> Sigh:
> 
>    struct listy *list = 0, *new;
> 
>    while (new = get()) dlist_add_nomalloc(&list, new);
> 
> You see why I just did a doubly linked list? That way I had one list  
> pointer fed to the function and the function could use it to find the  
> end of the list, but I _didn't_ have different list and list head  
> types. The downside is each node has an extra pointer, for which I get  
> the ability to traverse the list backwards, even if I never use it.)

It's still a bit confusing to me. list always points to the head and
list->prev points to the tail? Could you maybe remark that somewhere?
So it's nearly the same as the three argument version. (But we get a
doubly linked list in the end.)

> On the bright side, if I do keep dlist_pop() I can use it in patch to  
> trim a couple lines there...
> 
> Eh, I should worry about it in the morning. Bedtime, and still 5000  
> emails behind. (Mostly linux-kernel, but I need to read the  
> Documentation stuff in there. Plus the 3.11 kernel dropped on tuesday  
> and I need to do an Aboriginal release, which means a toybox release  
> _and_ I wanted to get LFS 7.4 built instead of the old 6.8 control  
> image that's years stale at this point...)

Sorry for keeping you up with this patch.

Felix

 1378504090.0


More information about the Toybox mailing list