[Toybox] Bad idea regarding threading...

Elie De Brauwer eliedebrauwer at gmail.com
Mon Apr 30 00:25:37 PDT 2012


On Sun, Apr 29, 2012 at 2:57 AM, Rob Landley <rob at landley.net> wrote:
> It occurs to me that if I add a CONFIG_TOYBOX_THREADS with the new
> directory traversal infrastructure, things like cp -a and rm -r could be
> done in a multithreaded manner.
>
> I.E. create a thread pool equal to the number of processors, and then
> every time you encounter a directory hand off the callbacks to a thread
> out of the thread pool. Everything they're doing is openat() based on a
> filehandle stored in the node structure (or a filehandle pair with the
> second stored in the node's ->extra field), so you don't need to worry
> about the current directory changing in another thread...

Indeed, and I think this could also be a very nice feature to
differentiate toybox from similar tools. To my knowledge there aren't
any 'userlands' available which have inherent multithreading support
(typically because most stem from the time when it was easier to
purchase a human kidney than to purchase SMP systems).
The only thing I want to add there is that in such a scenario I expect
the number of usable cores to be runtime configurable (e.g. through an
environment variable).
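
To make that concrete, here is a minimal sketch of what I have in mind.
The variable name TOYBOX_THREADS and the helper worker_count() are
purely hypothetical, nothing like this exists in toybox today, and
sysconf(_SC_NPROCESSORS_ONLN) is a widely available extension rather
than strict POSIX:

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical environment variable name, for illustration only. */
#define THREADS_ENV "TOYBOX_THREADS"

/* Return how many worker threads to start: the number of online CPUs,
 * optionally lowered by the environment. Invalid or out-of-range
 * values are ignored and the CPU count is used instead. */
long worker_count(void)
{
  long online = sysconf(_SC_NPROCESSORS_ONLN);
  const char *s = getenv(THREADS_ENV);

  if (online < 1) online = 1;  /* sysconf() failed: stay single threaded */

  if (s && *s) {
    char *end;
    long n;

    errno = 0;
    n = strtol(s, &end, 10);
    /* Accept only a clean positive decimal number. */
    if (!errno && !*end && n >= 1) return n < online ? n : online;
  }

  return online;
}
```

Capping the value at the number of online CPUs is just one possible
choice; setting the variable to 1 would then give the old
single-threaded behaviour without a rebuild.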

>
> I also note that bzip is trivially parallelizable (the file is handled
> in 900k independent chunks), and that bunzip2 could be parallelized with
> heuristics finding block start signatures and speculatively passing
> them off to threads (which then discard the results if they fail or the
> previous blocks don't line up to that starting point when it gets around
> to writing stuff out).  Commit 215 was actually a refactoring to help
> prepare for this...
>
> I can do something similar with gzip based on dictionary resets,
> although --rsyncable would help there.
>

That would be much like gzip vs pigz, see http://zlib.net/pigz/ and,
more specifically, http://zlib.net/pigz/pigz.pdf
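
On the bunzip2 side, the signature search mentioned above could be
sketched roughly as follows. This relies on two properties of the bzip2
format: every block starts with the 48-bit magic 0x314159265359, and
blocks are not byte aligned, so the scan has to go bit by bit (most
significant bit first). The function name find_block_starts() is made
up for this example, and this is not how toybox currently does it:

```c
#include <stddef.h>
#include <stdint.h>

#define BZ_BLOCK_MAGIC 0x314159265359ULL  /* 48-bit block header magic */
#define BZ_MASK48      0xFFFFFFFFFFFFULL

/* Scan buf for candidate block starts. The bit offset of each candidate
 * (counted from the start of buf) is stored in out[], up to max entries.
 * Returns the total number of candidates seen. */
size_t find_block_starts(const unsigned char *buf, size_t len,
                         uint64_t *out, size_t max)
{
  uint64_t window = 0;  /* the last 48 bits read */
  size_t found = 0;
  size_t i;
  int bit;

  for (i = 0; i < len; i++) {
    for (bit = 7; bit >= 0; bit--) {
      uint64_t consumed = (uint64_t)i * 8 + (uint64_t)(7 - bit) + 1;

      window = ((window << 1) | ((buf[i] >> bit) & 1)) & BZ_MASK48;
      if (consumed >= 48 && window == BZ_BLOCK_MAGIC) {
        if (found < max) out[found] = consumed - 48;
        found++;
      }
    }
  }

  return found;
}
```

The same 48 bits can also show up by chance inside compressed data, so
every hit is only a candidate: a worker thread would still have to
decode from there and throw the result away if the block turns out to
be bogus.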

> Anyway, the _point_ of all this is if I flip the config switch to enable
> thread support in toybox, it should _automatically_ take advantage of
> SMP, the way mksquashfs does. I was pondering adding a new cp -F flag
> and then went "no, that's stupid, that's like a "use the floating point
> coprocessor" flag. If you built it with support, just DO it...)
>
> Anyway, just musing aloud. I'm weird in that to _me_ multithreaded
> programming is simple because I cut my teeth on OS/2 twenty years ago,
> but I suspect I should finish the nonthreaded 1.0 version first before
> worrying about that...
>

The only addition I'd like to make is that if this is a path we want to
follow, I wouldn't wait too long to do it. The more code there is, the
more difficulty we can expect in making it all threadsafe (which
probably also means we need more mature tests in place than we have
now).

my 2 cents
-- 
Elie De Brauwer
