[Toybox] grep corner cases

Felix Janda felix.janda at posteo.de
Tue Aug 20 10:29:08 PDT 2013


Rob Landley wrote:
> On 08/19/2013 02:26:55 PM, Felix Janda wrote:
> > Hi,
> > 
> > I saw the comment in changeset 1017 on possible bugs in GNU grep.
> > 
> > The failing tests are for me:
> > 
> > testing "grep -vo" "grep -vo one input" "two\nthree\n" \
> >   "onetwoonethreeone\n" ""
> > testing "grep -Fx ''" "grep -Fx '' input" "one one one\n" \
> >   "one one one\n" ""
> > testing "grep -F -e blah -e ''" "grep -F -e blah -e '' input" \
> >   "one one one\n" "one one one\n" ""
> > 
> > -o is a GNU extension making grep only output the matched parts of each
> > matched line. So since -v inverts the set of all matched lines grep -vo
> > should not output anything.
> 
> Does it invert the set of matched _lines_, or does it invert the match  
> criteria? I made it so that:

     -v
     Select lines not matching any of the specified patterns. If the
     -v option is not specified, selected lines shall be those that
     match any of the specified patterns.

That sounds to me like the former, which fits the line-based nature of many
of the POSIX tools. It does, however, make grep -vo not very useful.
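
To make the line-based reading concrete, here is a minimal sketch, assuming
a POSIX sh and whatever grep comes first in PATH (the variable name "out" is
only for illustration). Without -o, -v simply selects the lines that contain
no match:

```shell
# Three input lines; only the middle one contains no "one".
out=$(printf 'one\ntwo\nthree one\n' | grep -v one)
printf '%s\n' "$out"    # prints: two
```

Under this reading there are no matched parts left in a selected line for -o
to print.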

>    echo oneandtwoandthree | grep -ov

Shouldn't it be

    echo oneandtwoandthree | grep -ov and

> would produce:
>    one
>    two
>    three
> 
> (I pondered onetwothree but that's not how -o without -v works...)
>
> The reason there are deviating test cases to consider is I'm not taking  
> "what gcc does" as an inherent definition of "the right thing to do".

But it may be a reason to reconsider the validity of the test case and to
test against other implementations. For example, busybox grep doesn't
output anything either.
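
A rough way to run that comparison, as a sketch only: the function name
try_vo is made up, and the list of commands is an assumption about what
might be installed locally. Implementations that are missing are skipped,
and no expected output is claimed here.

```shell
# Feed the disputed -vo test input to each grep that is installed.
try_vo() {
    for impl in "grep" "busybox grep" "toybox grep"; do
        cmd=${impl%% *}                  # first word, e.g. "busybox"
        command -v "$cmd" >/dev/null 2>&1 || continue
        printf '%s:\n' "$impl"
        # $impl is deliberately unquoted so "busybox grep" splits in two words.
        printf 'onetwoonethreeone\n' | $impl -vo one | sed 's/^/  /'
    done
    return 0
}
try_vo
```

An empty section under a name means that implementation printed nothing.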

> > -F turns on fixed string matching so '' is no longer the empty regex
> > which matches everything, but the empty string.
> 
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html
> 
> -F  Match using fixed strings. Treat each pattern specified as a string
>      instead of a regular expression. If an input line contains any of
>      the patterns as a contiguous sequence of bytes, the line shall be
>      matched. A null string shall match every line.

Sorry, I somehow overlooked the last line. So I agree with you on the
second test case.

> That implies that
> 
>    echo one | grep -F -e walrus -e ''
> 
> Should match one, but with the gnu/dammit version it only does so  
> _without_ the -F. Or with -F and just one argument...

busybox grep also agrees with you.
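
For anyone who wants to check their own grep, a small sketch (assuming a
POSIX sh; the variable name "single" is only for illustration). The first
command is the uncontroversial single-pattern case, the second is the
disputed one:

```shell
# Single empty fixed string: per the POSIX text quoted above, this should
# select every line.
single=$(echo one | grep -F '')
printf '%s\n' "$single"    # prints: one

# Two patterns, one of them empty: the disputed case. No expected output is
# given here because implementations have differed.
echo one | grep -F -e walrus -e '' || echo "(no match)"
```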

> > Combined with -x IMO
> > it should only match empty lines.
> 
> I asked the Austin guys how -F and -x interact. It's not obvious to me  
> from reading it.

Sounds like a good idea.

Felix
