[Toybox] grep corner cases

Felix Janda felix.janda at posteo.de
Wed Aug 21 13:42:29 PDT 2013


Rob Landley wrote:
> On 08/20/2013 12:29:08 PM, Felix Janda wrote:
> > Rob Landley wrote:
> > > On 08/19/2013 02:26:55 PM, Felix Janda wrote:
> > > > Hi,
> > > >
> > > > I saw the comment in changeset 1017 on possible bugs in GNU grep.
> > > >
> > > > The failing tests are for me:
> > > >
> > > > testing "grep -vo" "grep -vo one input" "two\nthree\n" "onetwoonethreeone\n" ""
> > > > testing "grep -Fx ''" "grep -Fx '' input" "one one one\n" "one one one\n" ""
> > > > testing "grep -F -e blah -e ''" "grep -F -e blah -e '' input" "one one one\n" \
> > > >   "one one one\n" ""
> > > >
> > > > -o is a GNU extension making grep only output the matched parts of each
> > > > matched line. So since -v inverts the set of all matched lines grep -vo
> > > > should not output anything.
> > >
> > > Does it invert the set of matched _lines_, or does it invert the match
> > > criteria? I made it so that:
> > 
> >      -v
> >      Select lines not matching any of the specified patterns. If the
> >      -v option is not specified, selected lines shall be those that
> >      match any of the specified patterns.
> > 
> > Does sound to me like the former. This fits the line based nature of many
> > of the POSIX tools. It however doesn't make grep -vo very useful.
> 
> Posix does not have the -o option. The -o option is not line based.  
> This is about the effect of other options on the -o option.

Since you still can't match anything spanning lines, -o doesn't seem to
turn grep into a byte-based tool.

What should

echo on | grep -vo a

output? If you say that the sense of matching is inverted, shouldn't
the output be


o
n
on

or some permutation thereof? The current output of toybox is pretty
interesting...
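
To make one possible "inverted match criteria" reading concrete (the one
the oneandtwoandthree example further down suggests), here is a rough
sketch in plain sh. It is not how toybox or any other grep implements
-vo. The helper name between_matches is made up for illustration, and it
only handles a fixed string rather than a regex: it prints the non-empty
pieces of each line that lie between occurrences of that string.

```shell
# Hypothetical helper, for illustration only (not part of any grep):
# print the non-empty parts of each input line that lie between
# occurrences of the fixed string given as $1.
between_matches() {
    pat=$1
    # An empty pattern would never consume input, so refuse it.
    [ -n "$pat" ] || return 1
    while IFS= read -r line; do
        while :; do
            case $line in
            *"$pat"*)
                # Text before the first occurrence of the pattern.
                piece=${line%%"$pat"*}
                [ -n "$piece" ] && printf '%s\n' "$piece"
                # Drop everything up to and including that occurrence.
                line=${line#*"$pat"}
                ;;
            *)
                # No further occurrence: the rest is one last piece.
                [ -n "$line" ] && printf '%s\n' "$line"
                break
                ;;
            esac
        done
    done
}

echo oneandtwoandthree | between_matches and   # prints one, two, three
echo on | between_matches a                    # prints on
```

Under this reading a line without any match comes out whole, which is
exactly what plain -v already does, so the two readings only differ on
lines that do contain a match.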

> > >    echo oneandtwoandthree | grep -ov
> > 
> > Shouldn't it be
> > 
> >     echo oneandtwoandthree | grep -ov and
> 
> Yes.
> 
> > > would produce:
> > >    one
> > >    two
> > >    three
> > >
> > > (I pondered onetwothree but that's not how -o without -v works...)
> > >
> > > The reason there are deviating test cases to consider is I'm not taking
> > > "what gcc does" as an inherent definition of "the right thing to do".
> > 
> > But maybe it's a reason to spend some thought on the validity of the
> > test case and maybe do some testing against other implementations. For
> > example busybox grep also doesn't output anything.
> 
> What other implementations? -o is a gnu/dammit extension.

Did you read the last paragraph you are quoting carefully?

Just FYI: obase doesn't contain grep.

> > > That implies that
> > >
> > >    echo one | grep -F -e walrus -e ''
> > >
> > > Should match one, but with the gnu/dammit version it only does so
> > > _without_ the -F. Or with -F and just one argument...
> > 
> > busybox grep also agrees with you.
> 
> And it's inconsistent:
> 
>    $ echo hello | grep -F -e ''
>    hello
>    $ echo hello | grep -F -e 'one' -e ''
>    $ echo hello | grep -e 'one' -e ''
>    hello
> 
> (That's pretty clearly a bug. If you're wondering why I'm not slavishly  
> copying gnu/dammit behavior it's because they're not very good at this.  
> They've just had a very large testing base reporting bugs for a very  
> long time.)

I don't care specifically about the GNU version. With GNU grep lying in
/usr/bin it's just too convenient to run tests against it.
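
Comparing several implementations on the same case does not take much
more effort. The following is only a sketch: try_impl is a made-up
helper name, and it assumes the busybox and toybox multicall binaries
(if installed at all) are on PATH under those names. It reports what
each command prints and its exit status, and deliberately does not
assume what any particular grep will answer.

```shell
# Hypothetical helper: try_impl LABEL INPUT CMD [ARGS...]
# Feeds INPUT (plus a trailing newline) to CMD and reports its exit
# status and combined output, or notes that CMD is not installed.
try_impl() {
    label=$1
    input=$2
    shift 2
    if command -v "$1" >/dev/null 2>&1; then
        status=0
        out=$(printf '%s\n' "$input" | "$@" 2>&1) || status=$?
        printf '%s: status=%s output=[%s]\n' "$label" "$status" "$out"
    else
        printf '%s: not installed\n' "$label"
    fi
}

# The -F case with an empty pattern from this thread, against whatever
# implementations happen to be available on this machine.
try_impl system  'one one one' grep -F -e blah -e ''
try_impl busybox 'one one one' busybox grep -F -e blah -e ''
try_impl toybox  'one one one' toybox grep -F -e blah -e ''
```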

> > > > Combined with -x IMO
> > > > it should only match empty lines.
> > >
> > > I asked the Austin guys how -F and -x interact. It's not obvious to me
> > > from reading it.
> > 
> > Sounds like a good idea.
> 
> They said -x nukes '' being a wildcard (which matches the non-F version  
> of the gnu/dammit behavior), but BSD gets it wrong:
> 
>     http://permalink.gmane.org/gmane.comp.standards.posix.austin.general/7923

Thanks.

> Rob

Felix