[Toybox] Issue with names_to_pid(), pidof, killall

Tue Dec 17 06:20:36 PST 2013

On 11/22/2013 12:26:14 AM, Ashwini Sharma wrote:
> Hi All,
> 
> Due to some issue with my email account, changing my email to
> ak.ashwini1981 at gmail.com

Finally catching up on my email myself. :)

> I came across a bug in pidof command. The issue is
> with _-o_ option. callback function returns _1_ for omit
> list, to names_to_pid().

Sigh. names_to_pid() is in pending, a command not in pending is calling  
it...

(My bad.)

> names_to_pid(), breaks the internal _for_ loop, at the same
> time it also breaks out of the _while(readdir())_ loop.
> Which is causing the issue, as other pids are not searched for.

Indeed, it should return 0 not 1.

> for (curname = names; *curname; curname++)
>       if (**curname == '/' ? !strcmp(cmd, *curname)
>           : !strcmp(basename(cmd), basename(*curname)))
>         if (callback(u, *curname)) break;
> -    if (*curname) break;
> }
> 
> Why break from _while_ loop, as there can be other processes in the  
> list
> waiting for actioin.
> removing this fixes the issue.

Returning nonzero says to abort the loop. We shouldn't return nonzero  
if we want additional callbacks callbacks.

> _names_to_pid()_ is also used by _killall_. killall also has issues,
> 
> 1. the exit status is always > 0, thats because of toys.exitval++
> in _main() function. Exit status need to be corrected.

Actually it's supposed to exit 1 if no processes were found, and 0 if  
it could kill at least one process. I'll have it zero out the exit code  
if a signal could successfully be delivered.

> 2. No error/warning message is displayed for a non-exiting process.
> as is one by GNU implementation. instead due to the _toys.exitval++_
> its always printing __ No such process __ without any process name.

Actually it prints ("bad %u", pid)?

   ret = kill(pid, TT.signum);
   if (ret == -1 && !(toys.optflags & FLAG_q))
     error_msg("bad %u", (unsigned)pid);

That looks like it should be perror_msg() so it reports the errno  
reason.

The "no such process" message was always printed because the exit value  
was never cleared.

No, the problem is if you kill things that don't exist in the ubuntu  
version:

   $ killall whackamole froglegs
   whackamole: no process found
   froglegs: no process found

You get an error message for each one. And this infrastructure isn't  
tracking that. Although to be honest, the upstream infrastructure is  
lying:

   $ killall init whackamole
   init(1): Operation not permitted
   init: no process found
   whackamole: no process found

And the errno behavior is slightly non-obvious:

   $ sleep 9999 &
   [1] 8016
   $ killall init sleep whackamole
   init(1): Operation not permitted
   init: no process found
   whackamole: no process found
   [1]+  Terminated              sleep 9999
   $ echo $?
   1

Rummage, rummage... Ok, if there was a command it couldn't kill  
(including "not found"), errno is 1. Order doesn't matter.

Ideally I'd save the errno of each entry so the message was right, but  
I don't want to mark up optargs[] (and if I did it'd screw up killing a  
second command with the same name), and can't conveniently use toybuf  
as a bitfield for two reasons (already used for a message right now,  
not necessarily long enough).

Speaking of not necessarily long enough:

   if (toys.optflags & FLAG_i) {
     sprintf(toybuf, "Signal %s(%d) ?", name, pid);

Bad because toybuf is 4k and name is environment data that could be  
128k, so somebody can trivially manipulate a heap overflow out of that.

Right, if I switch that to xsmprintf() I can use toybuf. I can save the  
start of the name array into the globals and then do "name-start" to  
get the offset into the array. If I use it as a bitfield I can  
reproduce the original (broken) behavior, and can track 32768 commands,  
which is probably more than anyone will ever use. (With 128k  
environment space they'd all have to be 3 character command names to  
_fill_ that, you'd need 2 chars to surpass it...)

If I use it as an array to store the errno... does it need to be short  
or char?

   $ cat arch/*/include/uapi/asm/errno* | awk '{print $3}' | sort -n
   ...
   255
   256
   257
   516
   1133

Sigh. Those last few aren't relevant (I grepped, they won't be produced  
by kill(2)), but from a complexity level I don't want to have to  
explain it in a giant comment, nor am I quite comfortable with 4096  
entries as a limit where behavior magically changes.

Screw it: malloc an array.

The next question is, how do I get the position in the array? I was  
thinking I could do pointer subtraction but that gives me how many  
bytes of string data were used. Hack up names_to_pid() to pass it?  
(Only currently two users...) Do a for loop checking?

Eh, I'll just do a for loop. Pathological case is less than 60k command  
line, and I don't have to compare strings (just pointers).

> In order not to modify the lib function _names_to_pid_ too much, these
> killall issues could be fixed from callback function _kill_process()_.

Modifying the callback is the correct approach, yes. The names_to_pid()  
function just iterates over the array we give it to find pids, there  
shouldn't be any logic about what use we put the results to.

> For printing the warning message for non existent processes, can have  
> a map
> for the processes which are signalled thru _kill_process_ and finally  
> once
> _names_to_pid_ returns, print the message for non-signalled processes.
> 
> Like to have your opinion.

I implemented the array version mentioned above before I actually read  
this far in the message. :)

I need to debug a bit before checking in and it's bus-to-work time, but  
I'll see if I can get this checked in tonight.

Rob