[Toybox] Confused by bash trap handler return value.

Tue Feb 11 15:41:35 PST 2025

On 2/11/25 08:34, Chet Ramey wrote:
> On 2/10/25 7:25 PM, Rob Landley wrote:
>> I don't understand how to reconcile the behavior of all three of these 
>> in debian's bash 5.2.15(1):
> 
> Trap actions have no "return value." POSIX says:
> 
> "The value of "$?" after the trap action completes shall be the value it
> had before the trap action was executed."

Ok, I can do that. Thanks.

The question is how to TEST it...

>> 3) The previous statement returns 1 and the trap handler returns 1, so 
>> the next statement sees zero...?
> 
> The AND-OR list returns 0, since the trap command succeeds, the kill
> command runs, and kill returns 0 if it sends at least one signal
> successfully. That is documented:
> 
> "kill returns true if at least one signal was successfully  sent,
>   or false if an error occurs or an invalid option is encountered."

/bin/kill from procps returns 1 if any error occurred (here, a "PID does
not exist" error), while bash's kill builtin returns 0 because at least
one signal was successfully sent (despite an error having occurred).
Their behavior differs because you can read the above either way.
(You're reading the "or" as a short-circuit operator; they're probably
reading the "at least one" as meaning "kill -s USR1" should return an
error, and "false if an error occurs" as a mandate.)

I suppose I can implement a 0 return code for the kill builtin and a 1 
return code if it's called via the $PATH, but... ew?
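
A minimal sketch of that difference, assuming a procps-style /bin/kill is
installed (the -x guard and the variable names are illustrative only).
USR1 is ignored here so that signalling $$ is harmless and only the exit
statuses are compared:

```shell
# Ignore USR1 so signalling ourselves is harmless; only exit statuses matter.
trap '' USR1

# Builtin kill: one valid target ($$) plus one PID that should never exist.
builtin_status=0
kill -s USR1 $$ 987654321 2>/dev/null || builtin_status=$?

# External kill, if present (assumption: it may be absent on small systems).
external_status="n/a"
if [ -x /bin/kill ]; then
  external_status=0
  /bin/kill -s USR1 $$ 987654321 2>/dev/null || external_status=$?
fi

trap - USR1
echo "builtin=$builtin_status external=$external_status"
```

With the Debian bash and procps combination discussed here, this would be
expected to report 0 for the builtin and 1 for /bin/kill; other shells
and kill implementations may well differ.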

My problem is that for the test suite I want a single command that sends
a signal but returns nonzero, and it has to be sort of atomic. The
signal should reliably be delivered by the Linux kernel before the
command returns: the command made a syscall, signals are checked for at
syscall return or when the scheduler runs, and the shell is wait()ing
for the child process, so it won't advance until after the command that
signaled its parent exits (and that wait() is itself a syscall that
checks for the signal upon return). So the shell should reliably notice
the pending signal before running the next shell command and execute
the trap handler at a specific location, giving me deterministic
output. Thus the /bin/kill behavior is useful to me to build a test
around (987654321 should never be a valid PID, but the signal has
already been sent to $$ because kill handles its command line arguments
in order), and the kill builtin behavior is not useful to me.
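
The test shape described above might look roughly like this. It is only
a sketch: it assumes a /bin/kill with the procps behavior (send to $$,
then fail on the bogus PID), and trap_ran and status are illustrative
names, not anything from the real test suite:

```shell
trap_ran=0
# The handler ends with a successful command, so a shell that leaked the
# handler's status into $? would report 0 instead of kill's failure.
trap 'trap_ran=1; true' USR1

status="skipped"
if [ -x /bin/kill ]; then
  status=0
  # Signals $$ first, then fails on a PID that should never exist.
  /bin/kill -s USR1 $$ 987654321 2>/dev/null || status=$?
fi

trap - USR1
echo "trap_ran=$trap_ran status=$status"
```

If /bin/kill is missing, or is an implementation that returns 0 here,
the sketch cannot tell a preserved status from a leaked one.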

Any sort of split-up "kill; false" means the signal would be handled
before the false runs, so I can't set up a non-success return code to
check after the signal handler. Doing a trap "" USR1; kill -s USR1;
false; trap "thingy" USR1 to ignore the signal around a critical section
A) isn't really testing the same thing, and B) again stomps the return
code (although maybe trap "thingy" USR1 POTATO would...?).
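
One possible way around this, offered as an untested assumption rather
than anything established above: have a subshell signal its parent and
then exit nonzero, so the kill and the failing status belong to one
foreground command. The sketch assumes $$ inside ( ) still names the
parent shell (as POSIX specifies) and that the shell defers the trap
until the foreground subshell has finished (as bash documents). The exit
value 3 and the variable names are arbitrary:

```shell
trap_ran=0
# The handler ends with a successful command (status 0), which differs
# from the subshell's exit status of 3.
trap 'trap_ran=1; true' USR1

status=0
# The subshell signals the parent shell, then exits 3.
(kill -s USR1 $$; exit 3) || status=$?

trap - USR1
echo "trap_ran=$trap_ran status=$status"
```

If $? is restored after the trap action as POSIX requires, status should
come out as 3 with trap_ran set to 1. Whether this exercises the same
code path as an external command returning nonzero is a separate
question.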

The remaining alternative for testing this is some sort of
"backgrounding and sleep" nonsense, which tends to end badly on loaded
systems (such as Android's test servers, where the amount of time a
sleep actually TAKES varies hugely). Although I also need to test what
"read" does when we get a signal...

I'm trying not to need additional binaries for my test suite, because
that makes testing in cross-compiled environments a lot more
complicated. (Right now the tests are a series of shell scripts you run
against the provided command line utilities. A compiler existing on the
system is not required, and no binaries ship with the test suite other
than small test data files for commands like file/blkid/tar/gzip to
examine; none of those are expected to be runnable on the current
architecture...) And yes, I wrote an "expect" implementation in bash.

I can test it by hand on my dev box, but this is a regression test suite 
meant to run after an arbitrary build (and ideally compare results 
against debian and/or busybox host commands)...

Rob