[Toybox] grep corner cases
Felix Janda
felix.janda at posteo.de
Wed Aug 21 13:42:29 PDT 2013
Rob Landley wrote:
> On 08/20/2013 12:29:08 PM, Felix Janda wrote:
> > Rob Landley wrote:
> > > On 08/19/2013 02:26:55 PM, Felix Janda wrote:
> > > > Hi,
> > > >
> > > > I saw the comment in changeset 1017 on possible bugs in GNU grep.
> > > >
> > > > The failing tests are for me:
> > > >
> > > > testing "grep -vo" "grep -vo one input" "two\nthree\n" \
> > > >   "onetwoonethreeone\n" ""
> > > > testing "grep -Fx ''" "grep -Fx '' input" "one one one\n" \
> > > >   "one one one\n" ""
> > > > testing "grep -F -e blah -e ''" "grep -F -e blah -e '' input" \
> > > >   "one one one\n" "one one one\n" ""
> > > >
> > > > -o is a GNU extension making grep only output the matched parts of
> > > > each matched line. So since -v inverts the set of all matched lines
> > > > grep -vo should not output anything.
> > >
> > > Does it invert the set of matched _lines_, or does it invert the
> > > match criteria? I made it so that:
> >
> > -v
> > Select lines not matching any of the specified patterns. If the
> > -v option is not specified, selected lines shall be those that
> > match any of the specified patterns.
> >
> > Does sound to me like the former. This fits the line based nature of
> > many of the POSIX tools. It however doesn't make grep -vo very useful.
>
> Posix does not have the -o option. The -o option is not line based.
> This is about the effect of other options on the -o option.
Since you still can't match things spanning lines, -o doesn't seem to
make grep into a byte-based tool.

What should

  echo on | grep -vo a

output? If you say that the sense of matching is inverted, shouldn't
the output be

  o
  n
  on

or some permutation thereof? The current output of toybox is pretty
interesting...
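
To make the two readings of -vo concrete, here is a rough sketch. The
function names vo_line_based and vo_inverted_parts are made up for
illustration and are not part of toybox or any grep. The sketch assumes
a grep that supports the -o extension and a POSIX awk. Note that the
second reading, as sketched, prints only the maximal pieces between
matches, so for "echo on" with pattern "a" it prints just "on", not
every non-matching substring.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Reading 1 (line based): -v selects the lines with no match, and -o
# then prints the matched parts of the selected lines. A selected line
# contains no match, so this never prints anything.
vo_line_based() {
  grep -v -e "$1" | grep -o -e "$1" || true
}

# Reading 2 (inverted parts): split each line at the matches and print
# the non-empty pieces in between.
vo_inverted_parts() {
  awk -v pat="$1" '{
    n = split($0, parts, pat)
    for (i = 1; i <= n; i++)
      if (parts[i] != "") print parts[i]
  }'
}

echo oneandtwoandthree | vo_line_based and      # prints nothing
echo oneandtwoandthree | vo_inverted_parts and  # prints one, two, three
echo on | vo_inverted_parts a                   # prints on
```

Under the first reading the -vo test above would expect empty output;
under the second it would expect "two\nthree\n".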
> > > echo oneandtwoandthree | grep -ov
> >
> > Shouldn't it be
> >
> > echo oneandtwoandthree | grep -ov and
>
> Yes.
>
> > > would produce:
> > > one
> > > two
> > > three
> > >
> > > (I pondered onetwothree but that's not how -o without -v works...)
> > >
> > > The reason there are deviating test cases to consider is I'm not
> > > taking "what gcc does" as an inherent definition of "the right thing
> > > to do".
> >
> > But maybe it's a reason to spend some thought on the validity of the
> > test case and maybe do some testing against other implementations. For
> > example busybox grep also doesn't output anything.
>
> What other implementations? -o is a gnu/dammit extension.
Did you read the last paragraph you are quoting carefully?
Just FYI: obase doesn't contain grep.
> > > That implies that
> > >
> > > echo one | grep -F -e walrus -e ''
> > >
> > > Should match one, but with the gnu/dammit version it only does so
> > > _without_ the -F. Or with -F and just one argument...
> >
> > busybox grep also agrees with you.
>
> And it's inconsistent:
>
> $ echo hello | grep -F -e ''
> hello
> $ echo hello | grep -F -e 'one' -e ''
> $ echo hello | grep -e 'one' -e ''
> hello
>
> (That's pretty clearly a bug. If you're wondering why I'm not slavishly
> copying gnu/dammit behavior it's because they're not very good at this.
> They've just had a very large testing base reporting bugs for a very
> long time.)
I don't care specifically about the GNU version. With GNU grep lying in
/usr/bin it's just too convenient to run tests against it.
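
For comparing implementations, something along these lines could be run
against each grep in turn. It is only a sketch: GREP is an assumed
environment variable naming a single grep binary (so a multi-word
command such as "busybox grep" would need a wrapper script), and the
report helper is made up for illustration. It looks only at the exit
status for the three empty-pattern cases quoted above and makes no claim
about which result is right.

```shell
#!/bin/sh
# GREP is a hypothetical variable naming the grep binary under test;
# it defaults to whatever "grep" comes first in PATH.
GREP=${GREP:-grep}

# Feed "hello" to grep with the given options and report whether the
# line was selected, judging by the exit status alone.
report() {
  label=$1
  shift
  if echo hello | "$GREP" "$@" >/dev/null 2>&1; then
    echo "$label: match"
  else
    echo "$label: no match"
  fi
}

check_empty_pattern() {
  report "-F -e ''" -F -e ''
  report "-F -e 'one' -e ''" -F -e one -e ''
  report "-e 'one' -e ''" -e one -e ''
}

check_empty_pattern
```

Running it as, for example, GREP=/usr/bin/grep sh script.sh prints one
line per case, which makes differences between implementations easy to
diff.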
> > > > Combined with -x IMO
> > > > it should only match empty lines.
> > >
> > > I asked the Austin guys how -F and -x interact. It's not obvious to
> > > me from reading it.
> >
> > Sounds like a good idea.
>
> They said -x nukes '' being a wildcard (which matches the non-F version
> of the gnu/dammit behavior), but BSD gets it wrong:
>
>
> http://permalink.gmane.org/gmane.comp.standards.posix.austin.general/7923
Thanks.
> Rob
Felix