[Toybox] grep and empty regexes

enh enh at google.com
Wed Jul 24 16:25:16 PDT 2019


so here's two FAILs and one accidental PASS (because the test doesn't
actually check the return code)...

grep: bad REGEX '': empty (sub)expression
FAIL: grep -e blah -e ''
echo -ne "one one one\n" > input
echo -ne '' | grep -e blah -e '' input
--- expected 2019-07-24 14:21:52.872813591 -0230
+++ actual 2019-07-24 14:21:52.872813591 -0230
@@ -1 +0,0 @@
-one one one
grep: bad REGEX '': empty (sub)expression
PASS: grep -w ''
grep: bad REGEX '': empty (sub)expression
FAIL: grep -w '' 2
echo -ne "one  two\n" > input
echo -ne '' | grep -w '' input
--- expected 2019-07-24 14:21:52.982813591 -0230
+++ actual 2019-07-24 14:21:52.982813591 -0230
@@ -1 +0,0 @@
-one  two

POSIX says there's no such thing as an empty regular expression, by
having a grammar that excludes the possibility:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html

BSD agrees, and both Android's and macOS's regcomp() reject the empty
regular expression.

GNU apparently disagrees (as i learned from your tests).

not sure what to do here, in particular because -- given your tests --
i don't think we can represent the GNU interpretation as a POSIX
regular expression?

...except i think there's a bug in the BSD implementation that does
allow '()', an empty group that matches the empty string and so
matches every line, the way GNU treats ''. it seems to have been there
for at least 26 years judging by
https://github.com/freebsd/freebsd/blame/master/lib/libc/regex/regcomp.c#L383
so i think it's probably safe to rely on that for the time being.
glibc's happy with it too.
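
to make the idea concrete, here's a minimal sketch of the substitution
described above. it is not the attached patch: compile_allowing_empty()
and line_matches() are hypothetical names made up for illustration, and
the sketch assumes (as above) that the platform's regcomp() accepts '()'
when REG_EXTENDED is set.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the patch: compile `pattern`,
 * substituting "()" for the empty string so that regcomp()
 * implementations that reject "" still yield a regex that matches
 * every line. Returns whatever regcomp() returns (0 on success). */
int compile_allowing_empty(regex_t *re, const char *pattern, int cflags)
{
    assert(pattern != NULL);

    if (*pattern == '\0') {
        /* "()" is only an empty group in extended syntax; in a basic
         * regex the parentheses would be literal characters, so force
         * REG_EXTENDED for this one substituted pattern. */
        return regcomp(re, "()", cflags | REG_EXTENDED);
    }
    return regcomp(re, pattern, cflags);
}

/* Returns 1 if `line` matches `pattern`, 0 if it does not, and -1 if
 * the pattern could not be compiled. */
int line_matches(const char *pattern, const char *line)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: we only need a yes/no answer, not match offsets. */
    if (compile_allowing_empty(&re, pattern, REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

with that, line_matches("", "one one one") should report a match on any
libc that accepts '()', which is what the failing "grep -e blah -e ''"
test expects. this doesn't attempt to cover the -w cases.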

patch attached. (i've said "BSD" rather than "POSIX" in the code
comment because BSD makes it clearer that this is a practical rather
than just theoretical concern.)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-grep-fake-GNU-behavior-for-non-POSIX-empty-regex.patch
Type: text/x-patch
Size: 2149 bytes
Desc: not available
URL: <http://lists.landley.net/pipermail/toybox-landley.net/attachments/20190724/cf1b60e6/attachment-0002.bin>

