[Toybox] [PATCH] taskset: fix buffer overflow from long mask

Rob Landley rob at landley.net
Mon Jun 23 13:22:05 PDT 2025


On 6/23/25 10:38, Jesse Rosenstock wrote:
> Previously, a long mask on the command line would overrun toybuf.

Good catch.

tl;dr: commit 105a72fd53c2

> Use sizeof(toybuf) rather than 4096 when calling sched_getaffinity;
> maybe toybuf's size will change.

Unlikely. We've had that discussion here before, for example the thread 
starting here:

http://lists.landley.net/pipermail/toybox-landley.net/2023-April/029522.html

When I defined toybuf 4k was the common page size on linux at the time 
(everywhere except DEC Alpha), but the REASON everybody except arm uses 
4k pages is it's a good reasonable scratch pad size, which is _why_ it's 
a good size for toybuf. When a command wants a larger buffer, it mallocs 
one. Heck, filesystems stuck with 512 byte blocks for decades until 
moving _to_ 4k last decade. (There's diminishing returns for efficiency 
with larger granularity, and the amount of tail waste increases.)

Since then, ARM used larger page sizes as a hack to access more 
physical memory without adding another page table level. And then tried 
very hard to convince people it had some sort of OTHER benefit to 
justify breaking everybody's binaries, even though that's not why they 
did it. (x86 just added the extra page table level.)

To be honest, I'm pretty sure I used 4096 there so I wouldn't have to 
wrap the line at 80 chars. :)

> Tested:
> % toybox taskset $( yes f | head -n 8193 | tr -d '\n' ) true
> taskset: mask too long
> 
> The util-linux taskset handles masks longer than zsh can construct:
> taskset $( yes f | head -n 131000 | tr -d '\n' ) true

Before using a fixed 4k bitmap (32768 processors) for taskset I went 
down ANOTHER historical rathole, "how many processors can a Linux system 
actually have". I'm pretty sure I redid the research and actually wrote 
it down this time back when commit e2b17f5e0cd3 went in. Checking my 
blog back around then, we get:

https://landley.net/notes-2022.html#24-07-2022

Which said only IBM used more than 64 processors in a defconfig (some 
powerpc set NR_CPUS to 2048 and s390x could set it to 512). That symbol 
sets the size of the array and thus the maximum this kernel could 
handle, not the amount actually present on any given hardware instance 
(usually way fewer than that).

The theoretical ranges kconfig allows the config symbol to be set to go 
up to 8192 on x86-64 and 4096 on arm64 and sparc (and arc if you count 
that), probably for bragging rights. From what the NUMA guys said back 
in the day you're way way way better off switching to a 
beowulf-descended cluster architecture long before you get anywhere NEAR 
that. I remember people trying to make a Linux system page faulting 
across the network transparently look like NUMA access to userspace back 
before the dot-com crash, but they never merged anything upstream, 
presumably for good reason.

The last headache-sized clusterfsck program I had to deal with just had 
a bunch of cheap Linux systems that mmaped pages from NAS and used range 
locking to mediate access. If you DON'T have serious contention, that's 
fine. If you DO have serious contention, throwing money at the problem 
to use big SMP hits seriously diminishing returns in the low double 
digit number of processors. All the scalability work people were doing 
was trying to move that to the HIGH double digit number of processors.

So a taskset that can't do more than 32768 (4x more than even the 
largest theoretical system) sounded fine to me. Happy to hear from a 
domain expert if that's changed...
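
For reference, the arithmetic behind those limits, as a minimal sketch.
The names TOYBUF_BYTES and mask_fits are made up for illustration and are
not in toybox; the length check just mirrors the 2*sizeof(toybuf) test in
the patch quoted below, and assumes the mask is given without a 0x prefix:

```shell
TOYBUF_BYTES=4096                    # sizeof(toybuf), per the discussion above
MAX_CPUS=$((TOYBUF_BYTES*8))         # one bit per CPU: 32768
MAX_HEX_DIGITS=$((TOYBUF_BYTES*2))   # one hex digit per 4 bits: 8192

# Succeed if a hex mask string (no 0x prefix) fits in the buffer.
mask_fits() { [ "${#1}" -le "$MAX_HEX_DIGITS" ]; }
```

So the 8193 digit mask in the test above is the shortest one that should
be rejected.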

> --- a/toys/other/taskset.c      2019-06-12 19:36:37.000000000 +0200
> +++ b/toys/other/taskset.c      2025-06-23 14:47:26.000000000 +0200
> @@ -74,6 +74,7 @@
> 
>       memset(toybuf, 0, sizeof(toybuf));
>       k = strlen(s = *toys.optargs);
> +    if (k > 2*sizeof(toybuf)) error_exit("mask too long");
>       s += k;
>       for (j = 0; j<k; j++) {
>         unsigned long digit = *(--s) - '0';
> @@ -121,8 +122,9 @@
>     unsigned i, j, nproc = 0;
> 
>     // This can only detect 32768 processors. Call getaffinity and count bits.
> -  if (!toys.optflags && -1!=sched_getaffinity(getpid(), 4096, toybuf)) {
> -    for (i = 0; i<4096; i++)
> +  if (!toys.optflags
> +      && -1 != sched_getaffinity(getpid(), sizeof(toybuf), toybuf)) {
> +    for (i = 0; i<sizeof(toybuf); i++)
>         if (toybuf[i]) for (j=0; j<8; j++) if (toybuf[i]&(1<<j)) nproc++;
>     }

$ git am blah.eml
Applying: taskset: fix buffer overflow from long mask
error: corrupt patch at line 22

Hmmm...
$ patch -p1 -i blah.eml
checking file toys/other/taskset.c
Hunk #1 succeeded at 69 with fuzz 1 (offset -5 lines).
Hunk #2 FAILED at 122.
1 out of 2 hunks FAILED -- saving rejects to file toys/other/taskset.c.rej

Fuzz? Ah, your first hunk starts with a blank line instead of the 
comment added in commit 5afab26b9c98 three years ago.

I'll just manually add the one line test and credit you in the commit 
comment. And now I REALLY want to add a test for this, but there isn't a 
tests/taskset.test file yet.

Hmmm, doing taskset tests is a bit of a design question innit? I can use 
nproc to make a mask (taskset doesn't work on mac/bsd anyway so I 
don't need the sysctl fallback from scripts/pending.sh) but do we start 
with all processors available? (Is it a failure NOT to? In which case 
running "taskset 1 make tests" would fail, which would be new.)

Anyway, it can start with:

MASK="$(($(nproc)+0))"
if [ "$MASK" -lt 2 ]
then
   echo "$SHOWSKIP: taskset (not SMP)"
   exit
fi
MASK="$(printf %x $(((1<<$MASK)-1)))"
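
One thing to watch with the shift: shell arithmetic is typically 64-bit,
so (1<<N)-1 can't be relied on once nproc reports 64 or more processors.
A possible alternative, sketched here under that assumption, builds the
mask one hex digit at a time instead (all_cpus_mask is a hypothetical
helper name, not something in the test suite):

```shell
# all_cpus_mask N: print a hex mask with the low N bits set.
# Runs in a subshell so n and out don't leak into the caller.
all_cpus_mask() (
  n=$1 out=
  # Each full group of 4 CPUs contributes one 'f' digit.
  while [ "$n" -ge 4 ]; do out="f$out"; n=$((n-4)); done
  # A leftover 1-3 CPUs become the most significant digit.
  [ "$n" -gt 0 ] && out="$(printf %x $(((1<<n)-1)))$out"
  echo "${out:-0}"
)
```

With that, MASK="$(all_cpus_mask "$(nproc)")" gives the same result as the
shift for small systems.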

Sigh, taskset -p is terrible, you can tell this was invented by gnu and 
not by real unix guys. I want to add a -P that produces actual usable 
output rather than prefixing it with garbage, but then TEST_HOST would 
fail unless I want to try to engage with the gnu/dammit crowd (who STILL 
haven't merged cut -DF nor turned it down either). I can run it through 
sed...
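
A rough sketch of that sed cleanup, assuming the "pid N's ... affinity
mask: X" output format shown further down (affinity_only is a made-up
name for illustration):

```shell
# Strip everything up to the last ": " so only the mask remains.
affinity_only() { sed 's/.*: //'; }
```

Usage would be something like: taskset -p $$ | affinity_only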

So the next obvious thing to do is "taskset 2 taskset", which doesn't 
work because running taskset with no arguments doesn't tell you the 
current process's mask (inherited from the parent process), you have to 
taskset -p and provide a pid. And I don't KNOW the PID of that second 
instance from the command line, it hasn't forked off when the command 
line is evaluated yet! That's ANOTHER thing I want to fix to be more 
unixy but of course incompatibility with gnu/dammit.

I don't want to use $$ because I don't want to make persistent changes 
to the parent shell, I'd rather use (taskset -p $BASHPID) to affect the 
subshell. Ooh, mksh _does_ support that, ok I can use it then.

NEXT PROBLEM: it's not "taskset -p $BASHPID $MASK". It's "taskset -p 
$MASK $BASHPID" which IS INSANE. And yes, that is THE order they have to 
go in:

$ taskset -p $BASHPID f
taskset: invalid PID argument: 'f'
$ taskset f -p $BASHPID
taskset: failed to execute -p: No such file or directory

I really hate gnu. They keep finding new ways to be obviously wrong.

And of course the first bash invocation outputs two lines of output I 
didn't ask for, because I specified -p to operate on an existing process. 
Behavior changing randomly: when I launch a new task it can be quiet, but 
when I modify an existing task it's chatty:

pid 3257's current affinity mask: f
pid 3257's new affinity mask: f

What a piece of... ahem.

Oh, and here's another fun one. If I tell a process to run on a specific 
processor and then try to ask ps -o cpu= what processor it's running on, 
toybox ps tells me but gnu ps always says "-". So ANOTHER test that 
TEST_HOST can't pass!

And no it doesn't have to be actually running to give useful results:

$ taskset 8 sleep 2 & sleep 1; toybox ps -o cpu $!
CPU
   3

8 is 1<<3 which is correct, my 4x laptop has 0 through 3 with mask 1 
specifying CPU 0, mask 2 specifying CPU 1, mask 4 specifying CPU 2, and 
mask 8 specifying CPU 3. The sleep 1 ensures that the sleep 2 is NOT 
running, but the task got pulled into CPU 3's scheduler queue and has 
stayed there. It's CPU 3's responsibility, and when becoming runnable 
that's where it would get its next timeslice. That's what I am asking. 
And which debian's ps won't tell me.
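
The mask to CPU mapping described above can be sketched in shell like
this. Both function names are hypothetical, and mask_to_cpus only
handles masks that fit in a positive shell integer (below 2^63):

```shell
# cpu_to_mask N: print the hex mask selecting only CPU N.
cpu_to_mask() { printf '%x\n' $((1<<$1)); }

# mask_to_cpus HEX: print the CPU numbers whose bits are set in HEX.
# Runs in a subshell so its variables don't leak into the caller.
mask_to_cpus() (
  mask=$((0x$1)) cpu=0 out=
  while [ "$mask" -ne 0 ]; do
    # Test the low bit, then shift the next CPU's bit into place.
    [ $((mask&1)) -eq 1 ] && out="$out $cpu"
    mask=$((mask>>1)) cpu=$((cpu+1))
  done
  echo $out
)
```

So cpu_to_mask 3 gives 8, and mask_to_cpus f lists CPUs 0 through 3.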

Sigh, is debian's ps being broken really gnu's fault? Who did this...

$ dpkg-query -S $(which ps)
procps: /bin/ps
$ aptitude show procps | grep Homepage
Homepage: https://gitlab.com/procps-ng/procps

Their mailing list is on "freelists.org". Well, it smells gnu-adjacent 
anyway, not digging further because it won't help. Can I grab the raw 
data out of /proc...

$ taskset 8 sleep 5 & cat /proc/$!/stat | toybox cut -DF 39
3

I can, but cut -DF still is not available in TEST_HOST, and the awk in 
pending is 4500 lines to review.

There's a reason I haven't done tests for this before now. Maybe I could...

$ gimme() { echo ${39};}; gimme $(cat /proc/self/stat)
1
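
One caveat with counting whitespace-separated fields: the second field
of /proc/PID/stat is the command name in parentheses, which can itself
contain spaces, and that shifts everything after it. A hedged sketch
that strips through the last ") " before counting (stat_cpu is a
hypothetical helper name, not something in the test suite):

```shell
# stat_cpu LINE: print field 39 (processor) from a /proc/PID/stat line.
stat_cpu() {
  # Drop "pid (comm) " so the remaining words start at field 3.
  set -- ${1##*) }
  # Field 39 is the 37th remaining word: skip 36, print the next.
  shift 36
  echo "$1"
}
```

For commands like sleep or cat the plain field count works fine, this
only matters when the name has a space or a ")" in it. Usage would be
something like: stat_cpu "$(cat /proc/$!/stat)"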

Ok, tests added. Now, looking back at YOUR test... the host doesn't 
error on long input, it silently trims the provided mask to the number 
of available processors. Which sounds like the right thing to do here.

Hmmm... taskset 0 is really testing the kernel, not toybox? Still, for 
completeness...

Rob
