[Toybox] tar --null

enh enh at google.com
Mon Jul 25 12:58:32 PDT 2022


On Tue, Jul 19, 2022 at 6:34 AM Rob Landley <rob at landley.net> wrote:

> On 7/18/22 18:55, enh wrote:
> > On Mon, Jul 18, 2022 at 9:02 AM Rob Landley <rob at landley.net> wrote:
> >     >     and in file.c:
> >     >
> >     >     + * TODO: XZ, JPEG size, dpkg.deb, rpm, mp3, odt, mp4, iso
> >     >     + * MBR boot sector (partition X: startsector %d, %d sectors;)
> >     >     + * word (.docx: Word 2007+), excel
> >     >
> >     > you shouldn't do those yourself --- you should make each of those a separate bug
> >     > on github with a "help wanted" or "starter project" label, and then next time
> >     > you have someone asking "hey, is there something i can look at?", you have stuff
> >     > ready and waiting...
> >
> >     Good suggestion, but I'm never sure what actually _is_ easy. I shelved this
> >     after doing about half of mp3 identification, which turns out to be a
> >     surprisingly large rathole due to funky container formats. (And I don't trust
> >     anything microsoft's ever touched not to be turing-complete to solve...)
> >
> > heh, i know exactly what you mean because (a) i have this problem all the time
> > at work, where people don't finish their "starter project" for years and (b)
> > your specific jpeg size example was one _i_ punted when i originally submitted
> > jpeg support because it turned out to be non-obvious.
>
> The problem with "leaving easy stuff for other people to do" is they don't do
> it. I submitted a series of updated patches to make the kernel's
> CONFIG_DEVTMPFS_MOUNT work for initramfs and not just the fallback root= mount,
> and nobody else ever picked that up and put it in.
>
> https://lkml.org/lkml/2017/9/13/651
>
> That seems quite easy, no? Here it is again 3 years later...
>
> https://lkml.iu.edu/hypermail/linux/kernel/2005.1/09399.html
>
> There's a type of salesmanship in getting Huck Finn's friends to paint his fence
> which is a completely different skillset from doing the work yourself. It's a
> quite useful skill I do not have.
>

yeah, but at the same time we do get people asking "is there something i
can do to help?", and having a pile of fences to paint eases that.


> > i still think this is the "least worst" option though, and that's actually one
> > reason why i suggested a separate bug for each: it lets people thrash about a
> > bit until they find one that _is_ easy (for them).
>
> People have been trying to get me to do more with bug trackers for a larger
> number of years than I like to think about.
>
> There's a whole lot of years of unmedicated ADHD tangled up in there. I use them
> when somebody ELSE manages them and regularly reviews what's been sitting there
> composting.


yeah, sorry for not having stayed on top of that after the last time you
mentioned it was an issue. i've been through just now and closed out the
stuff where we think it's fixed (but haven't heard back) or you've said
you're not going to do the thing (such as move README to markdown).

i _think_ everything that's left is actionable and still relevant. (and
it's back down to a single page again!)


> My self-managed workflow is make todo lists that work like new
> year's resolutions, then chase the shiny thing on a tangent from a tangent from
> a tangent until it's time to panic about externally imposed deadlines and Close
> All The Tabs.
>
> >     > (not that you can 100% trust me not to do some of those when i've had a week
> >     > when i didn't get to write even a line of code and i'm looking for something to
> >     > do. but i'm trying to _stop_ doing all the easy little pieces myself at work for
> >     > similar reasons!)
> >     >
> >     >     Trying to close tabs for a release. :)
> >
> >     And of course I symmetrically added -a to nsenter and unshare before noticing
> >     that debian only has -a in nsenter and not unshare. I also don't know why
> >     nsenter has -S and -G but unshare doesn't? It seems like "create new container"
> >     and "insert process into existing container" are almost the same problem
> >     space...?
> >
> > (that seems reasonable to me unless proven otherwise.)
>
> Yeah, but I should sync up with Denys periodically about whether busybox wants
> any of the new stuff. It's on the todo list...
>
> >     >     Yeah it's an N^2 search algorithm but what's the biggest hunk you've ever seen,
> >     >     200 lines? 1000? Modern hardware doing N^2 search over 1000 lines isn't going to
> >     >     break stride. The INPUT FILE size doesn't matter, except as a theoretical bound
> >     >     on the upper size of the hunk if you diff two completely unrelated files, but
> >     >     optimizing for that case seems silly?
> >     >
> >     > aye, though -- like you -- i assume that's the kind of pathological case they
> >     > were thinking of.
> >
> >     A) so where's the test case?
> >
> > tests?! did you ever look at any of the bell labs boys' stuff? :-)
>
> Yes, quite a lot actually. (Computer history hobby!) There's a lot of
> survivorship bias in there with what got published and retained 50 years later.
> (All those old 1960s cars lasted so much longer than the stuff we have today. I
> know because every time I see a surviving 1960s car it's lasted until now.)
>
> But I'm also wondering about where the line's moved between a shared PDP-11's
> definition of "computationally hard" and modern hardware.
>
> And ALSO:
>
> $ seq 1 100000 > one; seq 1 4 100000 > two; time diff -u one two > /dev/null
> real    0m0.051s
> user    0m0.039s
> sys     0m0.012s
> $ seq 1 100000 > one; seq 1 4 100000 > two; time toybox diff -u one two > /dev/null
> real    0m0.320s
> user    0m0.148s
> sys     0m0.172s
>
> toys/pending/diff.c runs at 1/6 the speed of debian's. I'm not sure whatever
> optimization it THINKS it's doing is buying us anything?
>
> >      <https://cs.android.com/android/kernel/superproject/+/common-android-mainline:build/kernel/build.sh;l=964
> >     <https://cs.android.com/android/kernel/superproject/+/common-android-mainline:build/kernel/build.sh;l=964>>>>
> >
> >     I have no idea why your email system does this.
> >
> > and sadly, it's not clever enough for me to say "plain text for mailing lists,
> > html for everything else". (or even just "plain text for any thread i _start_,
> > but respect whatever's already in use for any thread i _reply_ to".)
>
> Still beats what gmail's doing...
>
> >     >     But now that I've gone "well here's the 80/20 solution to handling mode shifts",
> >     >     I'm tempted to code that up instead. Lemme see if I get to it this weekend, if
> >     >     not I owe you this applied before monday.
> >     >
> >     > sgtm. i've been trying to stop committing things on fridays, so monday's the
> >     > earliest i'd be giving the kernel folks a new prebuilt anyway :-)
> >
> >     Didn't get it done over the weekend. Reeducating myself on args plumbing corner
> >     cases instead...
> >
> > ack. i tried to take an update but hit another -Werror=format-security issue
>
> Sigh:
>
>   char *reset = 0;
>
>   if (stuff) {
>     reset = "\e[0m";
>   }
>   if (reset) printf(reset);
>
> The problem is if I'm testing with gcc's false positive generator and forget to
> test with llvm's false positive generator, it still may not catch all the same
> false positives.
>
> My objection to ASAN is I'm not yet convinced it ISN'T a false positive
> generator, although I should give it a closer look. (My first encounter with it
> being commit 472599b99bec is a contributing factor here.)
>
> > with one of your diff.c changes. i've sent a patch (and a separate patch to add
> > that -Werror= to the default toybox configure, since that's one we always have
> > to fix in the end anyway; may as well catch them fresh?).
>
> I agree I should hit the false positives before you hit the false positives.
>
> I need something like a ./testy.sh script that builds with the NDK (ASAN
> enabled) and runs the test suite... which involves getting the test suite to
> pass when built with the NDK. Working on it, I'll try to go faster and see if I
> can reshuffle the priorities a bit. I have been accused of trying to boil the
> ocean on more than one occasion...
>

yeah, though i think asan by default is a great idea, i think the NDK only
adds to your problems with no real benefit. my point with asan is that any
C programmer should probably just run with that on all the time. it's great
at catching memory errors early and it's pretty damn cheap on x86-64 ---
unless i'm benchmarking i run most stuff under asan most of the time. (and
toybox doesn't have a lot of Android-only code, so it's unlikely that we'd
get significantly better coverage from building with the NDK, and the
*platform* is always ahead of the NDK anyway, so "builds with the NDK"
isn't as strong a guarantee as you might think, even ignoring the fact that
if you _do_ build with the NDK, we don't actually build the
Android-specific stuff in exactly the same way.)


> As for fixing diff: sadly my cleanups so far have broken it in more than one way
> (there's the object lifetime thing and the logic to figure out what to actually
> compare when given different kinds of source/target pairs, although it wasn't
> entirely right before) and I stopped with yet another large cleanup
> half-finished in a directory going A) I need more tests, B) I'm gonna try to
> just write a SIMPLE one I understand and see how bad it is.
>
> Digging through this diff code has been a learning experience, but you guys are
> already using this meaning you need to go from something that works to something
> else that works...
>
> > i'll try again tomorrow... (i want to try to use `timeout -i` too!)
>
> I switched the printf() to xputsn(), and fixed up the off by one error causing
> the segfault. (Adding a quote increments the start, only decrement on return
> when we added that quote, otherwise it's both wrong and an unaligned pointer
> that's not to the start of an allocation.)
>
> That fixes the immediate issues, but I still do not currently consider diff.c to
> be load bearing. (Then again it wasn't really before, and probably isn't worse
> for your use cases so...)
>
> Rob
>