[Toybox] grep corner cases
Rob Landley
rob at landley.net
Tue Aug 20 14:32:15 PDT 2013
On 08/20/2013 12:29:08 PM, Felix Janda wrote:
> Rob Landley wrote:
> > On 08/19/2013 02:26:55 PM, Felix Janda wrote:
> > > Hi,
> > >
> > > I saw the comment in changeset 1017 on possible bugs in GNU grep.
> > >
> > > The failing tests are for me:
> > >
> > > testing "grep -vo" "grep -vo one input" "two\nthree\n" "onetwoonethreeone\n" ""
> > > testing "grep -Fx ''" "grep -Fx '' input" "one one one\n" "one one one\n" ""
> > > testing "grep -F -e blah -e ''" "grep -F -e blah -e '' input" "one one one\n" \
> > > "one one one\n" ""
> > >
> > > -o is a GNU extension making grep only output the matched parts of each
> > > matched line. So since -v inverts the set of all matched lines grep -vo
> > > should not output anything.
> >
> > Does it invert the set of matched _lines_, or does it invert the match
> > criteria? I made it so that:
>
> -v
> Select lines not matching any of the specified patterns. If the
> -v option is not specified, selected lines shall be those that
> match any of the specified patterns.
>
> Does sound to me like the former. This fits the line based nature of many
> of the POSIX tools. It however doesn't make grep -vo very useful.
Posix does not have the -o option. The -o option is not line based.
This is about the effect of other options on the -o option.
> > echo oneandtwoandthree | grep -ov
>
> Shouldn't it be
>
> echo oneandtwoandthree | grep -ov and
Yes.
> > would produce:
> > one
> > two
> > three
> >
> > (I pondered onetwothree but that's not how -o without -v works...)
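A minimal sketch of that splitting behavior, for illustration only: this is not toybox's code, and `invert_only` is a hypothetical helper name. It assumes a POSIX awk and prints the non-empty pieces of each line that lie between matches of the pattern, which is one reading of what -o combined with -v could mean.

```shell
# Hypothetical helper (not part of any grep): print the parts of each input
# line that do NOT match the regex in $1, one piece per output line.
# Assumes the pattern contains no backslashes (awk -v processes escapes).
invert_only() {
  awk -v re="$1" '{
    # split() cuts the line at every match of re; pieces go into parts[1..n]
    n = split($0, parts, re)
    for (i = 1; i <= n; i++)
      if (parts[i] != "") print parts[i]   # skip empty pieces at the edges
  }'
}

# Prints "one", "two" and "three" on separate lines.
echo oneandtwoandthree | invert_only and
```

A line with no match at all comes out whole, which lines up with what plain -v does for non-matching lines.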
> >
> > The reason there are deviating test cases to consider is I'm not taking
> > "what gcc does" as an inherent definition of "the right thing to do".
>
> But maybe it's a reason to spend some thought on the validity of the
> test case and maybe do some testing against other implementations. For
> example busybox grep also doesn't output anything.
What other implementations? -o is a gnu/dammit extension.
> > That implies that
> >
> > echo one | grep -F -e walrus -e ''
> >
> > Should match one, but with the gnu/dammit version it only does so
> > _without_ the -F. Or with -F and just one argument...
>
> busybox grep also agrees with you.
And it's inconsistent:
$ echo hello | grep -F -e ''
hello
$ echo hello | grep -F -e 'one' -e ''
$ echo hello | grep -e 'one' -e ''
hello
(That's pretty clearly a bug. If you're wondering why I'm not slavishly
copying gnu/dammit behavior it's because they're not very good at this.
They've just had a very large testing base reporting bugs for a very
long time.)
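For comparison, here is a rough sketch of the consistent rule, assuming -F with several -e patterns means "the line contains any of these fixed strings". `matches_any_fixed` is a hypothetical helper, not code from any grep implementation. Since the empty string is a substring of every line, adding -e '' should make every line match no matter what other patterns are present.

```shell
# Hypothetical helper: succeed if the line in $1 contains any of the
# remaining arguments as a fixed (non-regex) substring.
matches_any_fixed() {
  line=$1
  shift
  for pat in "$@"; do
    case $line in
      # Quoting "$pat" makes it literal; an empty pattern leaves "**",
      # which matches any line.
      *"$pat"*) return 0 ;;
    esac
  done
  return 1
}

# Under this rule all three of the cases above print the line.
matches_any_fixed hello '' && echo hello
matches_any_fixed hello one '' && echo hello
```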
> > > Combined with -x IMO
> > > it should only match empty lines.
> >
> > I asked the Austin guys how -F and -x interact. It's not obvious to me
> > from reading it.
>
> Sounds like a good idea.
They said -x nukes '' being a wildcard (which matches the non-F version
of the gnu/dammit behavior), but BSD gets it wrong:
http://permalink.gmane.org/gmane.comp.standards.posix.austin.general/7923
> Felix
>
Rob