[Toybox] pathological case in sed s///g

Rich Felker dalias at libc.org
Mon May 6 10:48:31 PDT 2019


On Mon, May 06, 2019 at 12:42:44PM -0500, Rob Landley wrote:
> Huh... I'll assume REG_STARTEND works in bionic since you're pointing me at it.
> It's not in the regex man page but it _is_ in the glibc headers...
> 
>   https://github.com/bminor/glibc/commit/6fefb4e0b16
> 
> Looks like it went into glibc in 2004, which is way past 7 years. I should poke
> Michael Kerrisk to update the man page.
> 
> And musl explicitly refused to do it, but Rich makes bad calls all the time:
> 
>   https://www.openwall.com/lists/musl/2013/01/15/26
> 
> I still haven't got an #ifdef __MUSL__ in portability because Rich insists his
> libc is the only perfect piece of software ever written, but I can probe for it

You can #ifdef REG_STARTEND and use it conditionally depending on
whether the functionality is offered. There is no reason to hard-code
an assumption that musl doesn't or does have this functionality; it's
been proposed and there's even a patch somewhere. (It's not costly to
support relative to the current bad regex implementation, but the
concern is that it might impose a nontrivial cost on a future good
implementation once we're locked into having it.)
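
A minimal sketch of that conditional use, assuming a caller that wants
to match against a length-delimited buffer. The helper name
regexec_span is hypothetical (it is not from toybox or musl), and the
fallback branch simply copies the span into a null-terminated buffer:

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: run preg against the len bytes starting at buf.
 * Match offsets written to pmatch are relative to buf. */
static int regexec_span(const regex_t *preg, const char *buf, size_t len,
                        size_t nmatch, regmatch_t *pmatch, int eflags)
{
#ifdef REG_STARTEND
    /* REG_STARTEND reads pmatch[0] as input even when nmatch is 0,
     * so fall back to a local slot if the caller passed none. */
    regmatch_t one;
    regmatch_t *pm = nmatch ? pmatch : &one;

    pm[0].rm_so = 0;
    pm[0].rm_eo = (regoff_t)len;
    return regexec(preg, buf, nmatch, pm, eflags | REG_STARTEND);
#else
    /* Portable fallback: copy the span and null-terminate it.
     * An embedded NUL in buf ends the subject string early here. */
    char *copy = malloc(len + 1);
    int rc;

    if (!copy) return REG_ESPACE;
    memcpy(copy, buf, len);
    copy[len] = 0;
    rc = regexec(preg, copy, nmatch, pmatch, eflags);
    free(copy);
    return rc;
#endif
}
```

Because the check is on the macro rather than on the libc, the same
source picks up REG_STARTEND wherever the headers offer it.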

> If this _does_ match NUL bytes it lets me remove regexec0() entirely from libc,
> which would be very nice. And since it started life as a BSD extension (they're
> the only man page _documenting_ it) I get support on FreeBSD and probably MacOS too.

I'm not sure if the proposed patch supports matching NUL or not (i.e.
whether it treats the start/end as *additional* constraints, to work
with a substring of a necessarily null-terminated string, or whether
they replace the null-termination criterion). If we do adopt it we
should ensure we do whatever other implementations do here, especially
if that's important to use cases you or others want.
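
One way to find out which of the two behaviors a given libc has is a
small runtime probe. This is only a sketch: the function name
startend_matches_nul is hypothetical, and the result depends on the
implementation it runs against:

```c
#include <regex.h>

/* Hypothetical probe: returns 1 if a REG_STARTEND match can be found
 * past an embedded NUL, 0 if the NUL still ends the subject string,
 * -1 if REG_STARTEND is unavailable or regcomp() failed. */
static int startend_matches_nul(void)
{
#ifdef REG_STARTEND
    /* Three bytes of data, 'a' NUL 'b', followed by the literal's
     * own terminator so no implementation reads out of bounds. */
    static const char data[] = "a\0b";
    regex_t re;
    regmatch_t m[1];
    int rc;

    if (regcomp(&re, "b", 0)) return -1;
    m[0].rm_so = 0;
    m[0].rm_eo = 3;  /* span covers 'a', NUL, 'b' */
    rc = regexec(&re, data, 1, m, REG_STARTEND);
    regfree(&re);

    /* The only 'b' sits at offset 2, after the NUL. */
    return rc == 0 && m[0].rm_so == 2;
#else
    return -1;
#endif
}
```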

Rich

