[Toybox] sed

scsijon scsijon at lamiaworks.com.au
Mon Nov 11 21:39:41 PST 2019


On 11/11/19 12:13, Rob Landley wrote:
> 
> 
> On 11/10/19 6:56 PM, scsijon wrote:
>> I know it's not exactly a toybox problem, but can someone please sort this out
>> for me.
>>
>> sed ':a;N;$!ba;s|>\s*<|><|g'
> 
> Could you please be a little more vague?
I could have, except I figured that either you or one of those listening 
could answer the question. I'm definitely NOT a 'sed' person, and it was 
asked on a so-called chat board by someone who had the problem but 
couldn't work out why it was failing. I will pass on your answer when next on 
that board.

many thanks.

> 
> :a is a jump label
> N means read the next line of input and append it to this one.
> $! means match the last line, negated: when it's NOT the last line, do...
> b a means branch (unconditional jump) back to label a
> 
> So that part is going to read the whole input into a single line. (With the
> embedded \n kept, so it's still the same data, but processed as one thing.)
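
A minimal sketch of the read-everything loop described above, assuming a POSIX-style shell and sed. The function name slurp_demo is hypothetical, and the trailing s/\n/+/g is not part of the original command; it is only there to make the embedded newlines visible. The commands are split across -e options because some sed implementations do not accept ';' after a label.

```shell
# Hypothetical helper: join all input lines into one pattern space,
# then show each embedded newline as '+'.
slurp_demo() {
    # :a   label
    # N    append the next input line to the pattern space
    # $!ba when NOT on the last line, branch back to label a
    sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/+/g'
}

printf 'one\ntwo\nthree\n' | slurp_demo
# prints: one+two+three
```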
> 
> s is the normal s/// search and replace, except it doesn't need to be / it can
> be any character. In this case they're using | as the separators, so s|||
> 
> The "look for this" part is >\s*< and \s is a gnu/dammit regex extension meaning
> "any whitespace character" (space, tab, or newline), so \s* is any run of them.
> (It works with toybox on glibc, no idea about other libc regex engines.) The
> portable way would be to say [[:space:]], and the logic behind that is that
> [:space:] is a special range within [abc] character matches, a la
> [abc[:space:]def]
> 
> Then the "replace it with this" part is >< and the third part is "g" which means
> "global", I.E. "don't just do the first one, do all of them".
> 
> So, you're matching > followed by any amount of space (and * matches "zero or
> more repeats of" so it'll also match no spaces), followed by <, and it replaces
> them with >< (so when it matches no spaces it replaces it with itself).
> 
> It looks like this regex removes whitespace between HTML tags, including gluing
> lines together to do so.
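
To see that effect on a small input, here is a sketch of the same command using [[:space:]] in place of \s, assuming a POSIX-style shell and sed. The function name strip_tag_gaps and the sample markup are made up for illustration.

```shell
# Hypothetical helper: read all of stdin into one pattern space, then
# remove any whitespace (including newlines) sitting between > and <.
strip_tag_gaps() {
    sed -e ':a' -e 'N' -e '$!ba' -e 's|>[[:space:]]*<|><|g'
}

printf '<ul>\n  <li>a</li>\n  <li>b</li>\n</ul>\n' | strip_tag_gaps
# prints: <ul><li>a</li><li>b</li></ul>
```

Text inside a tag pair is left alone, since the match needs a > and a < with nothing but whitespace between them.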
> 
> Rob
> 