[Toybox] Bad idea regarding threading...

Rob Landley rob at landley.net
Sat Apr 28 17:57:05 PDT 2012


It occurs to me that if I add a CONFIG_TOYBOX_THREADS with the new
directory traversal infrastructure, things like cp -a and rm -r could be
done in a multithreaded manner.

I.e. create a thread pool sized to the number of processors, and then
every time you encounter a directory, hand its callbacks off to a thread
from the pool. Everything the callbacks do is openat() based, relative
to a filehandle stored in the node structure (or a filehandle pair with
the second stored in the node's ->extra field), so you don't need to
worry about the current directory changing in another thread...
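
A rough sketch of the thread pool idea above, not actual toybox code.
It assumes POSIX threads and sysconf(_SC_NPROCESSORS_ONLN) for the CPU
count, and it only counts non-directory entries instead of copying or
deleting anything. All the names (struct walk, push_dir(), visit_dir(),
worker(), count_tree(), MAX_WORKERS) are made up for illustration and
have nothing to do with the real dirtree code:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define MAX_WORKERS 64  // hypothetical cap, not a toybox constant

struct job { struct job *next; int fd; };

struct walk {
  pthread_mutex_t lock;
  pthread_cond_t wake;
  struct job *jobs;  // open directories waiting for a worker
  int busy;          // workers currently inside a directory
  long leaves;       // non-directory entries counted so far
};

// Queue an open directory fd. Callers hold w->lock once threads exist.
static void push_dir(struct walk *w, int fd)
{
  struct job *j = malloc(sizeof(*j));
  if (!j) { close(fd); return; }  // sketch: silently drop this subtree
  j->fd = fd;
  j->next = w->jobs;
  w->jobs = j;
  pthread_cond_signal(&w->wake);
}

// Visit one directory: count its leaves, queue its subdirectories.
static long visit_dir(struct walk *w, int fd)
{
  DIR *dir = fdopendir(fd);
  struct dirent *de;
  struct stat st;
  long leaves = 0;
  if (!dir) { close(fd); return 0; }
  while ((de = readdir(dir))) {  // each worker has its own DIR stream
    if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, "..")) continue;
    // Every lookup is relative to fd, never to the process-wide cwd.
    if (fstatat(fd, de->d_name, &st, AT_SYMLINK_NOFOLLOW)) continue;
    if (!S_ISDIR(st.st_mode)) { leaves++; continue; }
    int sub = openat(fd, de->d_name, O_RDONLY|O_DIRECTORY|O_NOFOLLOW|O_CLOEXEC);
    if (sub < 0) continue;
    pthread_mutex_lock(&w->lock);
    push_dir(w, sub);
    pthread_mutex_unlock(&w->lock);
  }
  closedir(dir);  // also closes fd
  return leaves;
}

static void *worker(void *arg)
{
  struct walk *w = arg;
  pthread_mutex_lock(&w->lock);
  for (;;) {
    while (!w->jobs && w->busy) pthread_cond_wait(&w->wake, &w->lock);
    if (!w->jobs) break;  // nothing queued, nobody busy: traversal done
    struct job *j = w->jobs;
    int fd = j->fd;
    w->jobs = j->next;
    free(j);
    w->busy++;
    pthread_mutex_unlock(&w->lock);
    long leaves = visit_dir(w, fd);
    pthread_mutex_lock(&w->lock);
    w->leaves += leaves;
    w->busy--;
    pthread_cond_broadcast(&w->wake);  // let idle workers recheck for exit
  }
  pthread_mutex_unlock(&w->lock);
  return NULL;
}

// Count non-directory entries under path, one worker per online CPU.
long count_tree(const char *path)
{
  struct walk w = {0};
  pthread_t tid[MAX_WORKERS];
  long n = sysconf(_SC_NPROCESSORS_ONLN);
  int started = 0, fd;
  assert(path);
  if ((fd = open(path, O_RDONLY|O_DIRECTORY|O_CLOEXEC)) < 0) return -1;
  if (n < 1) n = 1; else if (n > MAX_WORKERS) n = MAX_WORKERS;
  pthread_mutex_init(&w.lock, NULL);
  pthread_cond_init(&w.wake, NULL);
  push_dir(&w, fd);  // no threads yet, so no locking needed
  while (n--) if (!pthread_create(&tid[started], NULL, worker, &w)) started++;
  if (!started) worker(&w);  // no threads available: walk in this thread
  while (started) pthread_join(tid[--started], NULL);
  return w.leaves;
}
```

Calling count_tree(".") walks the current directory. Note that queued
directories sit there as open file descriptors, so a very wide tree
could run into the fd limit. A real version would probably queue names
relative to the parent's fd instead.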

I also note that bzip2 is trivially parallelizable (the file is handled
in 900k independent chunks), and that bunzip2 could be parallelized with
heuristics that find block start signatures and speculatively pass the
blocks off to threads (which then discard the results if they fail, or
if the previous blocks don't line up with that starting point when it
gets around to writing stuff out). Commit 215 was actually a refactoring
to help prepare for this...
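
To illustrate the block start heuristic: each bzip2 block begins with
the 48-bit magic 0x314159265359, and blocks start at arbitrary bit
offsets rather than byte boundaries, so the scanner has to slide a
window one bit at a time (most significant bit first). This is only a
sketch, and find_block_starts() is a hypothetical helper, not something
in the toybox tree. Every hit is just a candidate, because the same 48
bits can turn up inside compressed data by chance, which is why the
speculative threads have to be able to throw their results away:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_MAGIC 0x314159265359ULL  // bzip2 block header signature
#define MAGIC_MASK  0xFFFFFFFFFFFFULL  // low 48 bits

// Scan buf for candidate block starts. Stores the bit offset of the
// first bit of each signature in out[], and returns how many were
// stored (at most max).
size_t find_block_starts(const unsigned char *buf, size_t len,
                         uint64_t *out, size_t max)
{
  uint64_t window = 0, bits = (uint64_t)len * 8;
  size_t found = 0;

  assert(buf && out);
  for (uint64_t bit = 0; bit < bits && found < max; bit++) {
    // bzip2 packs bits starting from the top of each byte.
    unsigned b = (buf[bit >> 3] >> (7 - (bit & 7))) & 1;

    window = ((window << 1) | b) & MAGIC_MASK;
    // Need a full 48 bits in the window before comparing.
    if (bit >= 47 && window == BLOCK_MAGIC) out[found++] = bit - 47;
  }
  return found;
}
```

Each returned offset would be handed to a worker as a possible place to
start decoding. The thread writing output keeps a block's result only
if the previous block really did end at that bit offset.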

I can do something similar with gzip based on dictionary resets,
although --rsyncable would help there.

Anyway, the _point_ of all this is if I flip the config switch to enable
thread support in toybox, it should _automatically_ take advantage of
SMP, the way mksquashfs does. (I was pondering adding a new cp -F flag
and then went "no, that's stupid, that's like a 'use the floating point
coprocessor' flag. If you built it with support, just DO it...")

Anyway, just musing aloud. I'm weird in that to _me_ multithreaded
programming is simple because I cut my teeth on OS/2 twenty years ago,
but I suspect I should finish the nonthreaded 1.0 version first before
worrying about that...

Rob
-- 
GNU/Linux isn't: Linux=GPLv2, GNU=GPLv3+, they can't share code.
Either it's "mere aggregation", or a license violation.  Pick one.
