[Toybox] new toy: taskset

Rob Landley rob at landley.net
Tue Jul 17 19:06:32 PDT 2012


On 07/15/2012 06:34 AM, Elie De Brauwer wrote:
> Hello all,
> 
> In attach you can find an initial version of 'taskset'. It allows
> setting the cpu affinity of a given PID (or all tasks related with a
> given PID/TID). cpu affinity should be entered in hex, when no affinity
> is given the affinity for the process or the group of tasks is displayed.

Ok. In busybox, but not even in LSB. Huh. "other" then...

> I tested it (a.o.) with a yes > /dev/null and a top which shows the cpu
> load per core, and then you can see yes jump around
> 
> edb at lapedb:~/edb-stuff/toybox/toybox$ sudo ./toybox taskset 0x1 20941
> pid 20941's current affinity mask: ff
> pid 20941's new affinity mask: 1
> edb at lapedb:~/edb-stuff/toybox/toybox$ sudo ./toybox taskset 0x2 20941
> pid 20941's current affinity mask: 1
> pid 20941's new affinity mask: 2
> edb at lapedb:~/edb-stuff/toybox/toybox$ sudo ./toybox taskset 0x4 20941
> pid 20941's current affinity mask: 2
> pid 20941's new affinity mask: 4
> edb at lapedb:~/edb-stuff/toybox/toybox$ sudo ./toybox taskset 0x8 20941
> pid 20941's current affinity mask: 4
> pid 20941's new affinity mask: 8
> edb at lapedb:~/edb-stuff/toybox/toybox$ sudo ./toybox taskset 0x10 20941
> pid 20941's current affinity mask: 8
> pid 20941's new affinity mask: 10
> edb at lapedb:~/edb-stuff/toybox/toybox$ sudo ./toybox taskset 0x20 20941
> pid 20941's current affinity mask: 10
> pid 20941's new affinity mask: 20
> edb at lapedb:~/edb-stuff/toybox/toybox$ sudo ./toybox taskset 0x30 20941
> pid 20941's current affinity mask: 20
> pid 20941's new affinity mask: 30
> edb at lapedb:~/edb-stuff/toybox/toybox$ sudo ./toybox taskset 0x40 20941
> pid 20941's current affinity mask: 30
> pid 20941's new affinity mask: 40
> edb at lapedb:~/edb-stuff/toybox/toybox$ sudo ./toybox taskset 0x80 20941
> pid 20941's current affinity mask: 40
> pid 20941's new affinity mask: 80

Cool.

> One comment, I added sched.h into the toys.h include, but I also needed
> a #define _GNU_SOURCE or I'd run into troubles because it failed to find
> the CPU_SET* macro's. So this was added in taskset.c but I'm not sure if
> this is the correct way to go.

What an utterly horrible interface, which the glibc guys once again broke.

<rant>Hurd is a part of the gnu project. Linux is not. This is a
linux-specific system call, labeling it with "gnu" is like labeling it
"property of Sun Microsystems". They had nothing to do with it.</rant>

Your code is fine, but I'm gonna open code the darn system call anyway,
because I'm interpreting Android's "no GPL in userspace" as "No GNU is
good GNU".

Luckily, Linux system calls are binary backwards compatible forever
(modulo Documentation/feature-removal-schedule.txt), so if I just work
out what the ABI actually _is_, with appropriate word size and endianness...

Let's see, pid, size_t, and a pointer to... what? The size_t is bytes of
the blob of data the pointer points to, the problem is endianness.
(Since I don't personally have access to any machine with > 8
processors, let alone a big endian one, I kinda have to get this right
up front... or run strace on the existing code to see what it's passing
to the system call, but let's hold that in reserve...)

Digging into bits/sched.h it's treating it as an array of unsigned
longs, which on a big endian machine would be kind of disturbing. Ok,
forget what the headers are doing, what is the _kernel_ expecting?

kernel/sched/core.c:

SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
                unsigned long __user *, user_mask_ptr)
{
        int ret;
        cpumask_var_t mask;

        if ((len * BITS_PER_BYTE) < nr_cpu_ids)
                return -EINVAL;
        if (len & (sizeof(unsigned long)-1))
                return -EINVAL;

Yeah, that matches. The size is in bytes, but the kernel insists it must
be a multiple of sizeof(unsigned long). The glibc loonies of course
hardwire a size of 1024. Sigh: if I'm going to hardwire something, I
might as well use toybuf which is already there and starts zeroed.
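
For reference, a minimal sketch of what calling the system call directly
might look like, without the CPU_SET macros. The names mask_buf,
raw_getaffinity and cpu_allowed are made up for illustration (mask_buf
stands in for toybuf), and defining _DEFAULT_SOURCE to get the syscall()
prototype is an assumption about the libc headers in use:

```c
#define _DEFAULT_SOURCE   /* assumption: exposes syscall() in <unistd.h> */
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for toybuf: a scratch buffer whose size is a
 * multiple of sizeof(unsigned long), which the kernel check requires. */
unsigned long mask_buf[4096 / sizeof(unsigned long)];

/* Read the affinity mask of pid (0 means the caller) with the raw system
 * call. Returns the number of bytes the kernel filled in, or -1 with errno
 * set on failure. */
long raw_getaffinity(pid_t pid)
{
  memset(mask_buf, 0, sizeof(mask_buf));
  return syscall(SYS_sched_getaffinity, pid, sizeof(mask_buf), mask_buf);
}

/* Test whether cpu is set in the mask last read into mask_buf. The mask is
 * treated as an array of unsigned longs, not as an array of bytes. */
int cpu_allowed(unsigned cpu)
{
  unsigned bits = 8 * sizeof(unsigned long);

  return (mask_buf[cpu / bits] >> (cpu % bits)) & 1;
}
```

Calling raw_getaffinity(0) and then cpu_allowed(n) for each n reports
which processors the calling process may run on. Note the raw call
returns a byte count on success, where the glibc wrapper returns 0.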

So it looks like:

  unsigned long *blah = (unsigned long *)toybuf;
  int bits = 8*sizeof(long);
  blah[cpu/bits] |= 1UL<<(cpu&(bits-1));

But that's just NUTS on a big endian system, so lemme confirm that. Dig
through the system call entry point into the kernel, the code is doing
cpumask_and() which lives in include/linux/cpumask.h and is a wrapper
around linux/bitmap.h which says I should look at lib/bitmap.c which says:

 * The byte ordering of bitmaps is more natural on little
 * endian architectures.  See the big-endian headers
 * include/asm-ppc64/bitops.h and include/asm-s390/bitops.h
 * for the best explanations of this ordering.

Except there isn't an asm-ppc64 anymore, it got merged into arch/powerpc
(I should send a patch for that), so...

arch/powerpc/include/asm/bitops.h:

 * The bitop functions are defined to work on unsigned longs, so for a
 * ppc64 system the bits end up numbered:
 *
|63..............0|127............64|191...........128|255...........192|
 * and on ppc32:
 *
|31.....0|63....32|95....64|127...96|159..128|191..160|223..192|255..224|
 *

And that's exactly what I wanted to know. So my guess at how to do it
was right, and it just plain IS a crazy layout on big endian.
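
To make the layout concrete, here's a rough sketch of where a given cpu's
bit ends up in memory. The helper names (mask_set, mask_byte_of,
host_is_big_endian) are invented for this example, and it assumes the
usual 8-bit bytes:

```c
#include <stddef.h>
#include <string.h>

#define BITS_PER_LONG (8 * sizeof(unsigned long))

/* Set cpu in a mask laid out as an array of unsigned longs: bit
 * (cpu % BITS_PER_LONG) of word (cpu / BITS_PER_LONG). */
void mask_set(unsigned long *mask, unsigned cpu)
{
  mask[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
}

/* Byte offset within the mask where cpu's bit lands in memory, for a
 * little endian (big_endian == 0) or big endian (big_endian != 0) host. */
size_t mask_byte_of(unsigned cpu, int big_endian)
{
  size_t word = cpu / BITS_PER_LONG, byte = (cpu % BITS_PER_LONG) / 8;

  /* Big endian stores the least significant byte of each word last. */
  if (big_endian) byte = sizeof(unsigned long) - 1 - byte;

  return word * sizeof(unsigned long) + byte;
}

/* Detect the host byte order at run time by looking at the first byte
 * of an unsigned long holding the value 1. */
int host_is_big_endian(void)
{
  unsigned long one = 1;
  unsigned char first;

  memcpy(&first, &one, 1);
  return !first;
}
```

With 64 bit longs, mask_byte_of(0, 0) is 0 but mask_byte_of(0, 1) is 7:
on big endian, cpu 0 lives in the last byte of the first word. So the
mask can't be filled in as a plain byte array, it has to go through
unsigned long.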

*shrug* Ok.

Rob
-- 
GNU/Linux isn't: Linux=GPLv2, GNU=GPLv3+, they can't share code.
Either it's "mere aggregation", or a license violation.  Pick one.
