[Toybox] tar --null

enh enh at google.com
Mon Jul 18 17:02:41 PDT 2022


On Mon, Jul 18, 2022 at 4:55 PM enh <enh at google.com> wrote:

>
>
> On Mon, Jul 18, 2022 at 9:02 AM Rob Landley <rob at landley.net> wrote:
>
>> On 7/15/22 21:19, enh wrote:
>> > On Fri, Jul 15, 2022 at 9:34 AM Rob Landley <rob at landley.net> wrote:
>> >
>> >     On 7/14/22 18:53, enh wrote:
>> >     > On Wed, Jul 13, 2022 at 11:58 PM Rob Landley <rob at landley.net> wrote:
>> >     >
>> >     >     On 7/12/22 19:13, enh via Toybox wrote:
>> >     >     > so.. --transform works (though it confused people that it's not in the --help
>> ...
>> >     Yeah but August 6 is 3 months from the previous release and I'd like to do that
>> >     on a more regular schedule (modulo maybe slipping a bit to sync up with kernel
>> >     releases for mkroot), meaning I want to finish this properly soonish. :)
>> >
>> >     I have a half dozen open cans of worms right now... dd, sh, mkroot walkthrough,
>> >     diff, tar --transform, a redo of lib/passwd.c and everything depending on it,
>> >     and in file.c:
>> >
>> >     + * TODO: XZ, JPEG size, dpkg.deb, rpm, mp3, odt, mp4, iso
>> >     + * MBR boot sector (partition X: startsector %d, %d sectors;)
>> >     + * word (.docx: Word 2007+), excel
>> >
>> > you shouldn't do those yourself --- you should make each of those a separate bug
>> > on github with a "help wanted" or "starter project" label, and then next time
>> > you have someone asking "hey, is there something i can look at?", you have stuff
>> > ready and waiting...
>>
>> Good suggestion, but I'm never sure what actually _is_ easy. I shelved this
>> after doing about half of mp3 identification, which turns out to be a
>> surprisingly large rathole due to funky container formats. (And I don't trust
>> anything microsoft's ever touched not to be turing-complete to solve...)
>>
>
> heh, i know exactly what you mean because (a) i have this problem all the
> time at work, where people don't finish their "starter project" for years
> and (b) your specific jpeg size example was one _i_ punted when i
> originally submitted jpeg support because it turned out to be non-obvious.
>
> i still think this is the "least worst" option though, and that's actually
> one reason why i suggested a separate bug for each: it lets people thrash
> about a bit until they find one that _is_ easy (for them).
>
>
>> > (not that you can 100% trust me not to do some of those when i've had a week
>> > when i didn't get to write even a line of code and i'm looking for something to
>> > do. but i'm trying to _stop_ doing all the easy little pieces myself at work for
>> > similar reasons!)
>> >
>> >     Trying to close tabs for a release. :)
>>
>> And of course I symmetrically added -a to nsenter and unshare before noticing
>> that debian only has -a in nsenter and not unshare. I also don't know why
>> nsenter has -S and -G but unshare doesn't? It seems like "create new container"
>> and "insert process into existing container" are almost the same problem
>> space...?
>>
>
> (that seems reasonable to me unless proven otherwise.)
>
>
>> >     Stream forward until you hit a diff, and then accumulate lines from each file
>> >     one at a time scanning BACKWARDS in the other file to find matching lines (where
>> >     does new last line of file 2 match in the list-since-difference of file1), and
>> >     when you find -U *2 lines of match you've ended the hunk. Flush what you've seen
>> >     (keeping the usual three lines of starting context) and move forward again as
>> >     matched. This usually leaves unconsumed lines in the other file (sometimes ALL
>> >     of what we've loaded from one file is unconsumed, that happens when you add or
>> >     remove a single line in isolation for example) but you just need to feed those
>> >     back in as "new" lines to the search algorithm...
>> >
>> >     Yeah it's an N^2 search algorithm but what's the biggest hunk you've ever seen,
>> >     200 lines? 1000? Modern hardware doing N^2 search over 1000 lines isn't going to
>> >     break stride. The INPUT FILE size doesn't matter, except as a theoretical bound
>> >     on the upper size of the hunk if you diff two completely unrelated files, but
>> >     optimizing for that case seems silly?
>> >
>> > aye, though -- like you -- i assume that's the kind of pathological case they
>> > were thinking of.
>>
>> A) so where's the test case?
>>
>
> tests?! did you ever look at any of the bell labs boys' stuff? :-)
>
>
>> B) McIlroy's paper was published in 1976, which is theoretically 30 iterations
>> of Moore's Law ago, implying we can literally handle a billion times as much
>> corner case processing as they could.
>>
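The "billion times" figure holds up as back-of-the-envelope arithmetic. A quick sketch, assuming the popular 18-month doubling period (that period is an assumption here, it isn't stated anywhere in the thread):

```python
# Rough check of "30 iterations of Moore's Law ago".
# ASSUMPTION: one doubling every 18 months (the commonly quoted figure).
PAPER_YEAR = 1976
THREAD_YEAR = 2022
YEARS_PER_DOUBLING = 1.5

# How many doublings fit between the paper and this thread.
doublings = (THREAD_YEAR - PAPER_YEAR) / YEARS_PER_DOUBLING

# Growth factor after exactly 30 doublings.
growth = 2 ** 30

print(f"{doublings:.1f} doublings")
print(f"2**30 = {growth:,}")
```

So roughly 30 doublings, and 2**30 is a little over a billion.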
>> > (because although it never happens "for real", it happens interactively, and
>> > that's probably when people are most sensitive to speed.) i don't remember
>> > seeing a single hunk more than tens of lines (except the other pathological
>> > case of "new file").
>>
>> If people can send me a test case, I'm happy to fix it?
>>
>> In theory the improved search the paper described is just a subset of the N^2
>> search that abandons attempts faster to find a non-optimal solution quickly.
>> They're just doing it over the whole file instead of a current potential
>> hunk...
>>
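For anyone following along, the hunk search described above can be sketched in a few lines. This is only an illustration of the idea as described in this thread, not the code in toybox's diff.c: the function name `find_hunk_end` and the `need` parameter (the number of matching lines that ends a hunk, i.e. the -U value doubled) are made up for the example, and it ignores the leading context lines entirely.

```python
def find_hunk_end(a, b, need):
    """Find where two line lists resynchronize after a difference.

    a and b both start at the first differing line. Returns (i, j) such
    that the hunk covers a[:i] and b[:j], and a[i:i+need] == b[j:j+need].
    Returns (len(a), len(b)) if the inputs never line up again.
    """
    limit = max(len(a), len(b))
    # "Load" one more line from each file per round.
    for loaded in range(need, limit + 1):
        # Try the newest lines of a against b, then the other way round.
        for new, old, swapped in ((a, b, False), (b, a, True)):
            end = min(loaded, len(new))
            if end < need:
                continue
            tail = new[end - need:end]
            # Scan BACKWARDS through what has been loaded from the other file.
            for stop in range(min(loaded, len(old)), need - 1, -1):
                if old[stop - need:stop] == tail:
                    i, j = end - need, stop - need
                    return (j, i) if swapped else (i, j)
    return len(a), len(b)


# One deleted line: the hunk is a[:1] on one side and nothing on the other.
print(find_hunk_end(["x", "c1", "c2", "c3"], ["c1", "c2", "c3"], 2))
```

The nested loops are the N^2 part, but N is bounded by the size of the hunk being searched rather than the size of the input files.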
>> >     >     > but in the meantime
>> >     >     > the kernel build script now uses --null with -T:
>> >     >     > https://cs.android.com/android/kernel/superproject/+/common-android-mainline:build/kernel/build.sh;l=964
>> >     >     > <https://cs.android.com/android/kernel/superproject/+/common-android-mainline:build/kernel/build.sh;l=964>
>>
>> I have no idea why your email system does this.
>>
>
> and sadly, it's not clever enough for me to say "plain text for mailing
> lists, html for everything else". (or even just "plain text for any thread
> i _start_, but respect whatever's already in use for any thread i _reply_
> to".)
>
>
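Back on the `--null` with `-T` usage quoted above: the point of a NUL-delimited file list is that NUL is the one byte that can't appear in a path, so names containing newlines or spaces survive intact. A minimal sketch of the parsing side, assuming input shaped like `find -print0` output (the helper name `split_null_list` is invented for this example, and dropping empty entries is a choice made here, not a statement about what any tar implementation does):

```python
def split_null_list(data: bytes) -> list[bytes]:
    """Split a NUL-delimited name list into individual names.

    Empty entries (such as the one after a trailing NUL) are dropped.
    """
    return [name for name in data.split(b"\0") if name]


# Two names, one of which contains an embedded newline.
listing = b"plain.txt\0name with\nnewline.txt\0"
names = split_null_list(listing)
print(names)
```

A newline-delimited list would have split the second name into two bogus entries.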
>> >     But now that I've gone "well here's the 80/20 solution to handling mode shifts",
>> >     I'm tempted to code that up instead. Lemme see if I get to it this weekend, if
>> >     not I owe you this applied before monday.
>> >
>> > sgtm. i've been trying to stop committing things on fridays, so monday's the
>> > earliest i'd be giving the kernel folks a new prebuilt anyway :-)
>>
>> Didn't get it done over the weekend. Reeducating myself on args plumbing corner
>> cases instead...
>>
>
> ack. i tried to take an update but hit another -Werror=format-security
> issue with one of your diff.c changes. i've sent a patch (and a separate
> patch to add that -Werror= to the default toybox configure, since that's
> one we always have to fix in the end anyway; may as well catch them fresh?).
>
> i'll try again tomorrow... (i want to try to use `timeout -i` too!)
>

heh, this doesn't seem intentional (especially because it happens without
`-i`), but it wasn't obvious to me what the fix is?

~/toybox$ time ./toybox timeout -s SEGV 10 ./foo
timeout: exec ./foo: No such file or directory

real 0m10.035s
user 0m0.009s
sys 0m0.013s
~/toybox$
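
To make the expected behaviour concrete (this is not a diagnosis of toybox's timeout, and not its code): if the parent reaps the child as soon as it exits instead of always waiting out the full interval, a failed exec returns immediately. A rough sketch, where `run_with_timeout` is a made-up helper and the 127/124 exit codes follow GNU timeout's convention as an assumption:

```python
import os
import signal
import time


def run_with_timeout(argv, seconds):
    """Run argv, returning its exit code, or 124 if it had to be killed."""
    pid = os.fork()
    if pid == 0:
        # Child: exec the command; if exec fails, exit right away.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)
    deadline = time.monotonic() + seconds
    while True:
        # Poll for the child so an early exit (including a failed exec)
        # is noticed without waiting for the deadline.
        done, status = os.waitpid(pid, os.WNOHANG)
        if done:
            return os.waitstatus_to_exitcode(status)
        if time.monotonic() >= deadline:
            os.kill(pid, signal.SIGTERM)
            os.waitpid(pid, 0)
            return 124
        time.sleep(0.01)
```

With this shape, `run_with_timeout(["./foo"], 10)` on a missing `./foo` comes back with 127 almost instantly rather than after ten seconds.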


> Rob
>>
>