<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Sep 28, 2022 at 10:20 PM Rob Landley <<a href="mailto:rob@landley.net">rob@landley.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 9/28/22 14:54, enh wrote:<br>
> heh... funny you should mention "." in the first position...<br>
> <br>
> $ echo "foo.jar" | old-toybox grep '\.jar'<br>
> foo.jar<br>
> $ echo "foo.jar" | new-toybox grep '\.jar'<br>
> $ <br>
<br>
Technically that's '\' in the first position, which is the problem.<br>
<br>
(Two bugs there: the parsing was wrong, and I had an optimization where the fast<br>
pattern match skips checking the first character, because obviously we already<br>
matched that to find the right bucket of patterns for this character... EXCEPT if<br>
that first character is escaped, skipping one character on each side of the<br>
comparison puts the pattern traversal out of sync.)<br>
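<br>
To illustrate the out-of-sync problem, here is a minimal Python sketch (not the<br>
actual toybox C code). The helper names are hypothetical, and it assumes the bucket<br>
lookup already found the pattern's first literal character at position pos:<br>
<pre>
```python
def unescape(pattern):
    # Turn an escaped fixed pattern into the literal text it stands for,
    # so a backslash followed by '.' becomes just '.'.
    out = []
    i = 0
    while i != len(pattern):
        if pattern[i] == "\\" and i + 1 != len(pattern):
            i += 1  # drop the backslash, keep the escaped character
        out.append(pattern[i])
        i += 1
    return "".join(out)


def buggy_match_at(line, pos, pattern):
    # Skip one character on each side because the bucket already matched it.
    # If the pattern starts with a backslash, only the backslash is skipped,
    # so the escaped character is compared against the wrong input byte.
    return line[pos + 1:].startswith(pattern[1:])


def fixed_match_at(line, pos, pattern):
    # Unescape first, so "one character" means the same thing on both sides.
    literal = unescape(pattern)
    return line[pos + 1:].startswith(literal[1:])


if __name__ == "__main__":
    print(buggy_match_at("foo.jar", 3, r"\.jar"))  # the escaped pattern is missed
    print(fixed_match_at("foo.jar", 3, r"\.jar"))  # the escaped pattern is found
```
</pre>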
<br>
Meanwhile, a pattern '^$' should match ONLY empty lines, and the easiest way to<br>
handle that is to special case that one to the regex path. The fast path can<br>
still handle '^' and '$', which match every line: a zero length pattern does a zero<br>
length match at each position, which -o already has plumbing to filter out because *<br>
does that too. It's zero-or-more instances, so a* is going to match with<br>
length zero at every place that does NOT have an 'a', a la:<br>
<br>
$ echo '' | grep 'z*' | wc -l<br>
1<br>
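<br>
The same zero length behavior can be seen with Python's re module, used here only<br>
as a stand-in for the grep matcher (the function names are made up for this<br>
sketch, and the second one just mimics the -o filtering described above):<br>
<pre>
```python
import re


def match_spans(pattern, line):
    # Every match as (start, end), zero length ones included.
    return [m.span() for m in re.finditer(pattern, line)]


def only_matching(pattern, line):
    # Roughly what -o keeps: matches with start == end are discarded.
    return [line[s:e] for s, e in match_spans(pattern, line) if s != e]


if __name__ == "__main__":
    print(match_spans("z*", "potato"))    # a zero length match at every position
    print(match_spans("^", "potato"))     # one zero length match, at the start
    print(match_spans("$", "potato"))     # one zero length match, at the end
    print(only_matching("a*", "banana"))  # only the real 'a' matches survive
```
</pre>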
<br>
All '^' and '$' do is say the zero length match occurs just once at the start or<br>
end of every line, but the -o logic discards all zero length matches and the<br>
non-o logic just cares that there IS a match. UNLESS you're producing colored<br>
output, but the colored output mostly piggybacks on -o. There is a sort of<br>
terrible corner case though, where the color escapes make the zero length match visible:<br>
<br>
$ echo potato | grep --color=always '' | hd<br>
00000000 70 6f 74 61 74 6f 0a |potato.|<br>
00000007<br>
$ echo potato | toybox grep --color=always '' | hd<br>
00000000 1b 5b 6d 1b 5b 31 3b 33 31 6d 1b 5b 6d 70 6f 74 |.[m.[1;31m.[mpot|<br>
00000010 61 74 6f 0a |ato.|<br>
00000014<br>
$ echo potato | toybox grep --color=always '$' | hd<br>
00000000 1b 5b 6d 70 6f 74 61 74 6f 1b 5b 31 3b 33 31 6d |.[mpotato.[1;31m|<br>
00000010 1b 5b 6d 0a |.[m.|<br>
00000014<br>
<br>
Yeah, I have a todo item to try to optimize the color escape generation for<br>
toybox but last time I sat down and looked at it I just didn't have the spoons.<br>
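<br>
One possible shape for that cleanup, as a rough Python sketch rather than anything<br>
toybox actually does: emit color escapes only around non-empty matches. The<br>
colorize() helper is hypothetical, and the two escape strings are just the byte<br>
sequences visible in the hex dumps above:<br>
<pre>
```python
import re

RED = "\x1b[1;31m"  # bytes 1b 5b 31 3b 33 31 6d in the dumps
RESET = "\x1b[m"    # bytes 1b 5b 6d in the dumps


def colorize(pattern, line):
    out = []
    last = 0
    for m in re.finditer(pattern, line):
        start, end = m.span()
        if start == end:
            continue  # zero length match: nothing to highlight, emit no escapes
        out.append(line[last:start])                # unmatched text, uncolored
        out.append(RED + line[start:end] + RESET)   # matched text, highlighted
        last = end
    out.append(line[last:])
    return "".join(out)


if __name__ == "__main__":
    print(repr(colorize("", "potato")))     # no escapes at all
    print(repr(colorize("$", "potato")))    # no escapes at all
    print(repr(colorize("tat", "potato")))  # escapes only around 'tat'
```
</pre>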
<br>
> that's a simplified case based on the real-world build break caused by<br>
...<br>
> merging zero jar files into one instead of all the jar files :-)<br>
<br>
Acknowledged. Try commit 193009855266?<br></blockquote><div><br></div><div>the updated build prebuilts pass all the presubmit tests at least :-)</div><div><br></div><div>i'll refrain from switching everyone over on a friday (that's already a saturday for some people) though, and will do that on (my) monday morning instead...</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Also, I checked in the start of the sed/tar xform protocol stuff. It's not<br>
handling flags yet, but it's passing the data so it _can_ do so. (I didn't want<br>
to send you a tar/sed pair that had to be upgraded in tandem and then changed<br>
AGAIN later. Hopefully this is the usable protocol version...)<br></blockquote><div><br></div><div>i'll wait for an "all clear" before i poke further :-)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Rob<br>
</blockquote></div></div>