<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jul 19, 2022 at 6:34 AM Rob Landley <<a href="mailto:rob@landley.net">rob@landley.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 7/18/22 18:55, enh wrote:<br>

> On Mon, Jul 18, 2022 at 9:02 AM Rob Landley<br>

<<a href="mailto:rob@landley.net" target="_blank">rob@landley.net</a><br>

>     >     and in file.c:<br>

>     ><br>

>     >     + * TODO: XZ, JPEG size, dpkg.deb, rpm, mp3, odt, mp4, iso<br>

>     >     + * MBR boot sector (partition X: startsector %d, %d sectors;)<br>

>     >     + * word (.docx: Word 2007+), excel<br>

>     ><br>

>     > you shouldn't do those yourself --- you should make each of those a<br>

>     separate bug<br>

>     > on github with a "help wanted" or "starter project" label, and then next time<br>

>     > you have someone asking "hey, is there something i can look at?", you have<br>

>     stuff<br>

>     > ready and waiting...<br>

> <br>

>     Good suggestion, but I'm never sure what actually _is_ easy. I shelved this<br>

>     after doing about half of mp3 identification, which turns out to be a<br>

>     surprisingly large rathole due to funky container formats. (And I don't trust<br>

>     anything microsoft's ever touched not to be turing-complete to solve...)<br>

><br>

> heh, i know exactly what you mean because (a) i have this problem all the time<br>

> at work, where people don't finish their "starter project" for years and (b)<br>

> your specific jpeg size example was one _i_ punted when i originally submitted<br>

> jpeg support because it turned out to be non-obvious.<br>

<br>

The problem with "leaving easy stuff for other people to do" is they don't do<br>

it. I submitted a series of updated patches to make the kernel's<br>

CONFIG_DEVTMPFS_MOUNT work for initramfs and not just the fallback root= mount,<br>

and nobody else ever picked that up and put it in.<br>

<br>

<a href="https://lkml.org/lkml/2017/9/13/651" rel="noreferrer" target="_blank">https://lkml.org/lkml/2017/9/13/651</a><br>

<br>

That seems quite easy, no? Here it is again 3 years later...<br>

<br>

<a href="https://lkml.iu.edu/hypermail/linux/kernel/2005.1/09399.html" rel="noreferrer" target="_blank">https://lkml.iu.edu/hypermail/linux/kernel/2005.1/09399.html</a><br>

<br>

There's a type of salesmanship in getting Huck Finn's friends to paint his fence<br>

which is a completely different skillset from doing the work yourself. It's a<br>

quite useful skill I do not have.<br></blockquote><div><br></div><div>yeah, but at the same time we do get people asking "is there something i can do to help?", and having a pile of fences to paint eases that.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

> i still think this is the "least worst" option though, and that's actually one<br>

> reason why i suggested a separate bug for each: it lets people thrash about a<br>

> bit until they find one that _is_ easy (for them).<br>

<br>

People have been trying to get me to do more with bug trackers for a larger<br>

number of years than I like to think about.<br>

<br>

There's a whole lot of years of unmedicated ADHD tangled up in there. I use them<br>

when somebody ELSE manages them and regularly reviews what's been sitting there<br>

composting. </blockquote><div><br></div><div>yeah, sorry for not having stayed on top of that after the last time you mentioned it was an issue. i've been through just now and closed out the stuff where we think it's fixed (but haven't heard back) or you've said you're not going to do the thing (such as move README to markdown).</div><div><br></div><div>i _think_ everything that's left is actionable and still relevant. (and it's back down to a single page again!)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">My self-managed workflow is make todo lists that work like new<br>

year's resolutions, then chase the shiny thing on a tangent from a tangent from<br>

a tangent until it's time to panic about externally imposed deadlines and Close<br>

All The Tabs.<br>

<br>

>     > (not that you can 100% trust me not to do some of those when i've had a week<br>

>     > when i didn't get to write even a line of code and i'm looking for<br>

>     something to<br>

>     > do. but i'm trying to _stop_ doing all the easy little pieces myself at<br>

>     work for<br>

>     > similar reasons!)<br>

>     >  <br>

>     >     Trying to close tabs for a release. :)<br>

> <br>

>     And of course I symmetrically added -a to nsenter and unshare before noticing<br>

>     that debian only has -a in nsenter and not unshare. I also don't know why<br>

>     nsenter has -S and -G but unshare doesn't? It seems like "create new container"<br>

>     and "insert process into existing container" are almost the same problem<br>

>     space...?<br>

> <br>

> (that seems reasonable to me unless proven otherwise.)<br>

<br>

Yeah, but I should sync up with Denys periodically about whether busybox wants<br>

any of the new stuff. It's on the todo list...<br>

 <br>

>     >     Yeah it's an N^2 search algorithm but what's the biggest hunk you've<br>

>     ever seen,<br>

>     >     200 lines? 1000? Modern hardware doing N^2 search over 1000 lines<br>

>     isn't going to<br>

>     >     break stride. The INPUT FILE size doesn't matter, except as a<br>

>     theoretical bound<br>

>     >     on the upper size of the hunk if you diff two completely unrelated<br>

>     files, but<br>

>     >     optimizing for that case seems silly?<br>

>     ><br>

>     > aye, though -- like you -- i assume that's the kind of pathological case they<br>

>     > were thinking of.<br>

> <br>

>     A) so where's the test case?<br>

> <br>

> tests?! did you ever look at any of the bell labs boys' stuff? :-)<br>

<br>

Yes, quite a lot actually. (Computer history hobby!) There's a lot of<br>

survivorship bias in there with what got published and retained 50 years later.<br>

(All those old 1960s cars lasted so much longer than the stuff we have today. I<br>

know because every time I see a surviving 1960s car it's lasted until now.)<br>

<br>

But I'm also wondering about where the line's moved between a shared PDP-11's<br>

definition of "computationally hard" and modern hardware.<br>

<br>

And ALSO:<br>

<br>

$ seq 1 100000 > one; seq 1 4 100000 > two; time diff -u one two > /dev/null<br>

real    0m0.051s<br>

user    0m0.039s<br>

sys     0m0.012s<br>

$ seq 1 100000 > one; seq 1 4 100000 > two; time toybox diff -u one two > /dev/null<br>

real    0m0.320s<br>

user    0m0.148s<br>

sys     0m0.172s<br>

<br>

toys/pending/diff.c runs at 1/6 the speed of debian's. I'm not sure whatever<br>

optimization it THINKS it's doing is buying us anything?<br>

<br>

>      <<a href="https://cs.android.com/android/kernel/superproject/+/common-android-mainline:build/kernel/build.sh;l=964" rel="noreferrer" target="_blank">https://cs.android.com/android/kernel/superproject/+/common-android-mainline:build/kernel/build.sh;l=964</a><br>

>     <<a href="https://cs.android.com/android/kernel/superproject/+/common-android-mainline:build/kernel/build.sh;l=964" rel="noreferrer" target="_blank">https://cs.android.com/android/kernel/superproject/+/common-android-mainline:build/kernel/build.sh;l=964</a>>>>><br>

> <br>

>     I have no idea why your email system does this.<br>

> <br>

> and sadly, it's not clever enough for me to say "plain text for mailing lists,<br>

> html for everything else". (or even just "plain text for any thread i _start_,<br>

> but respect whatever's already in use for any thread i _reply_ to".)<br>

<br>

Still beats what gmail's doing...<br>

 <br>

>     >     But now that I've gone "well here's the 80/20 solution to handling<br>

>     mode shifts",<br>

>     >     I'm tempted to code that up instead. Lemme see if I get to it this<br>

>     weekend, if<br>

>     >     not I owe you this applied before monday.<br>

>     ><br>

>     > sgtm. i've been trying to stop committing things on fridays, so monday's the<br>

>     > earliest i'd be giving the kernel folks a new prebuilt anyway :-)<br>

> <br>

>     Didn't get it done over the weekend. Reeducating myself on args plumbing corner<br>

>     cases instead...<br>

>  <br>

> ack. i tried to take an update but hit another -Werror=format-security issue<br>

<br>

Sigh:<br>

<br>

  char *reset = 0;<br>

<br>

  if (stuff) {<br>

    reset = "\e[0m";<br>

  }<br>

  if (reset) printf(reset);<br>

<br>

The problem is if I'm testing with gcc's false positive generator and forget to<br>

test with llvm's false positive generator, it still may not catch all the same<br>

false positives.<br>
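
<br>

(Editorial sketch, not code from the thread: one way to keep the snippet above<br>

quiet under -Werror=format-security is to never pass a non-literal string as<br>

the format. The helper name put_reset() and its use_color parameter are made<br>

up for illustration, and "\033" stands in for the GNU-only "\e" escape.)<br>

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper, not toybox code: write an optional terminal reset
// sequence without using a variable as a printf format string.
// Returns the number of bytes written, 0 if nothing to write, -1 on error.
int put_reset(FILE *out, int use_color)
{
  char *reset = 0;

  if (use_color) reset = "\033[0m";
  if (!reset) return 0;

  // printf(reset) is what the compiler complains about: a non-literal
  // format with no arguments. Either of these avoids the warning:
  //   fprintf(out, "%s", reset);
  //   fputs(reset, out);
  return fputs(reset, out) == EOF ? -1 : (int)strlen(reset);
}
```

Calling put_reset(stdout, 1) emits the escape sequence and put_reset(stdout, 0)<br>

writes nothing. The string is harmless here either way, which is why the<br>

warning reads as a false positive, but the fputs() form costs nothing.<br>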

<br>

My objection to ASAN is I'm not yet convinced it ISN'T a false positive<br>

generator, although I should give it a closer look. (My first encounter with it<br>

being commit 472599b99bec is a contributing factor here.)<br>

<br>

> with one of your diff.c changes. i've sent a patch (and a separate patch to add<br>

> that -Werror= to the default toybox configure, since that's one we always have<br>

> to fix in the end anyway; may as well catch them fresh?).<br>

<br>

I agree I should hit the false positives before you hit the false positives.<br>

<br>

I need something like a ./testy.sh script that builds with the NDK (ASAN<br>

enabled) and runs the test suite... which involves getting the test suite to<br>

pass when built with the NDK. Working on it, I'll try to go faster and see if I<br>

can reshuffle the priorities a bit. I have been accused of trying to boil the<br>

ocean on more than one occasion...<br></blockquote><div><br></div><div>yeah, though i think asan by default is a great idea, i think the NDK only adds to your problems with no real benefit. my point with asan is that any C programmer should probably just run with that on all the time. it's great at catching memory errors early and it's pretty damn cheap on x86-64 --- unless i'm benchmarking i run most stuff under asan most of the time. (and toybox doesn't have a lot of Android-only code, so it's unlikely that we'd get significantly better coverage from building with the NDK, and the *platform* is always ahead of the NDK anyway, so "builds with the NDK" isn't as strong a guarantee as you might think, even ignoring the fact that if you _do_ build with the NDK, we don't actually build the Android-specific stuff in exactly the same way.)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

As for fixing diff: sadly my cleanups so far have broken it in more than one way<br>

(there's the object lifetime thing and the logic to figure out what to actually<br>

compare when given different kinds of source/target pairs, although it wasn't<br>

entirely right before) and I stopped with yet another large cleanup<br>

half-finished in a directory going A) I need more tests, B) I'm gonna try to<br>

just write a SIMPLE one I understand and see how bad it is.<br>

<br>

Digging through this diff code has been a learning experience, but you guys are<br>

already using this, meaning you need to go from something that works to something<br>

else that works...<br>

<br>

> i'll try again tomorrow... (i want to try to use `timeout -i` too!)<br>

<br>

I switched the printf() to xputsn(), and fixed up the off-by-one error causing<br>

the segfault. (Adding a quote increments the start, so only decrement on return<br>

when we added that quote; otherwise it's both wrong and an unaligned pointer<br>

that's not to the start of an allocation.)<br>
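
<br>

(Editorial sketch, not the actual diff.c code: the pattern described above,<br>

reduced to a standalone function. The name quote_if_needed() and the rule<br>

"quote when the name contains a space" are assumptions for illustration only.)<br>

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical helper for illustration; not toybox behavior.
// Returns a malloc()ed copy of name, wrapped in double quotes if it
// contains a space. The caller frees the result.
char *quote_if_needed(const char *name)
{
  int quote = strchr(name, ' ') != 0;
  size_t len = strlen(name);
  char *start = malloc(len+3);  // room for two quotes plus the NUL

  if (!start) return 0;
  if (quote) *start++ = '"';    // adding a quote increments the start
  memcpy(start, name, len);
  if (quote) start[len++] = '"';
  start[len] = 0;

  // Step back only if we stepped forward above. An unconditional start-1
  // would point one byte before the allocation in the unquoted case.
  return quote ? start-1 : start;
}
```

With this shape the returned pointer is always the one malloc() handed out,<br>

so passing it to free() is valid in both the quoted and unquoted cases.<br>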

<br>

That fixes the immediate issues, but I still do not currently consider diff.c to<br>

be load bearing. (Then again it wasn't really before, and probably isn't worse<br>

for your use cases so...)<br>

<br>

Rob<br>

</blockquote></div></div>