[Toybox] Confused by bash trap handler return value.

Wed Feb 12 13:24:50 PST 2025

On 2/12/25 3:36 PM, Rob Landley wrote:
> On 2/12/25 09:27, Chet Ramey wrote:
>>>> "The value of "$?" after the trap action completes shall be the value it
>>>> had before the trap action was executed."
>>>
>>> Ok, I can do that. Thanks.
>>>
>>> The question is how to TEST it...
>>
>> Testing that trap actions preserve $? is not hard. This had better not
>> echo 1.
>>
>> trap 'echo WINCH ; false' WINCH
>> (exit 41)       # force $? to something neither 0 nor 1
>> kill -WINCH $$
> 
> At which point $? is now the return code of the kill command, overwriting 
> the 41.

It will be 0, unless this script was invoked with SIGWINCH ignored, in
which case all bets are off because the trap action never gets run.

> 
>> # add this if you're concerned that kill will finish before the SIGWINCH
>> # arrives, since it will cause multiple system calls
>> (exit 42)
> 
> If the trap handler was processed right after kill but before (exit) this 
> overwrites the $? from kill _or_ leaked by the signal handler.

That's what the comment means. Comment the subshell command out. Seriously,
a trap action never sets a $? that survives beyond its execution. You should
be able to verify that with static code analysis.

> 
>> echo $?
> 
> So seeing "42" here doesn't prove the signal handler preserved the 0 from 
> kill.

The goal is to force the handler to run if it hasn't due to delayed signal
delivery. However you do that is ok. Probably (exit $?) would be better.
I just don't think you need it, and you don't need the subshell command.
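
Here is a sketch of the (exit $?) variant, assuming USR1 in place of WINCH
and a made-up SH_UNDER_TEST variable to select the shell. The trap action
ends with (exit 7) so that a leaked status is distinguishable from both
the 41 and kill's 0.

```shell
# Hypothetical knob: which shell binary to exercise (defaults to sh).
shell_under_test=${SH_UNDER_TEST:-sh}

out=$("$shell_under_test" <<'EOF'
trap 'printf "trapped "; (exit 7)' USR1   # action's own status is 7
(exit 41)          # force $? to something neither 0 nor 7
kill -s USR1 $$    # kill's status replaces the 41
(exit $?)          # extra command boundary; passes kill's status through
echo "status=$?"
EOF
)
echo "$out"
```

Whether the action runs right after kill or only after the subshell, a
shell that preserves $? across trap actions should print
"trapped status=0".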

> 
>> The trap won't get run until after the kill returns; trap actions are
>> not run asynchronously.
> 
> But the kill returns before the (exit 42) runs.

And as long as the signal is delivered by then, it should be fine. I
can't see any reason that it wouldn't be, since the kernel is posting
the signal to the same process and bash is single-threaded. This is the
theoretical race condition I referred to.

In normal operation, the kill builtin calls kill(2), sends the signal,
the signal arrives, the trap signal handler notes the pending trap,
the kill builtin returns, the shell sets $?, and the trap action runs.
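
That ordering can be observed from a script. This is only a sketch: USR1
is substituted for WINCH, and SH_UNDER_TEST is a hypothetical variable for
choosing the shell to run it under.

```shell
# Hypothetical knob: which shell binary to exercise (defaults to sh).
shell_under_test=${SH_UNDER_TEST:-sh}

out=$("$shell_under_test" <<'EOF'
trap 'printf "trap "; false' USR1   # action ends with a failing command
printf 'before '
kill -s USR1 $$                     # builtin kill signals this same shell
status=$?                           # read after the trap action has run
printf 'after status=%s\n' "$status"
EOF
)
echo "$out"
```

The expected result is "before trap after status=0": the action runs
after kill returns and before the next command, and the 1 from false does
not replace the 0 from kill.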

> I implemented a check for signals between each command, meaning my code 
> checks for pending signals before launching the subshell (and inserts a 
> synthetic "eval" on the sh_function call stack for each one). Are you 
> saying it should NOT do that?

No.

> 
> (Also, I left the signal handler blocked and have the return from the 
> "trap" unblock it, so it defers handling a second instance of the same 
> signal until the trap handler for the first signal returns. You can have 
> different signals interrupt each other, but only uniquely. 

I allow recursive trap invocations, with a warning for non-release
versions of the shell. I got bug reports (or feature requests) when I
did it your way (sorry, I forget the exact details).

Some scripts change the trap action in the trap handler, resend the signal
to themselves, and expect it to have effect immediately, even in a
complicated shell function run in a trap action.
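
A sketch of that kind of script, with made-up function names (first,
second), USR1 as the signal, and a hypothetical SH_UNDER_TEST variable to
pick the shell:

```shell
# Hypothetical knob: which shell binary to exercise (defaults to sh).
shell_under_test=${SH_UNDER_TEST:-sh}

out=$("$shell_under_test" <<'EOF'
second() { printf 'second '; }
first() {
    printf 'first '
    trap second USR1    # change the action from inside the running action
    kill -s USR1 $$     # resend the signal to ourselves
    printf 'first-done '
}
trap first USR1
kill -s USR1 $$
echo end
EOF
)
echo "$out"
```

A shell that allows recursive trap invocation would be expected to print
"second" before "first-done". One that defers the second signal until the
first action returns would print it afterwards. Either way both actions
should run exactly once.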

>> I'm going to change it for POSIX mode, since that's compatible with what
>> other POSIX shells have implemented. Bash default mode will stay the same.
> 
> I plead the 5th on moving targets. 

We've talked about this before.

> (Doesn't help me here, I'd have to do a 
> bash version check before adding -p.

To what?

> I think I'll stick with running
> $(which kill).)

You should probably run type instead of which.
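
A small sketch of the difference, on the assumption that the goal is to
find the external kill binary. which only searches PATH and knows nothing
about builtins, while type reports what the shell itself would run. The
-P option used here is specific to bash's type builtin, so it is guarded.

```shell
# Ask the shell how it resolves "kill"; typically reports a shell builtin.
kind=$(type kill)

# bash only: -P forces a PATH search and prints the file that would be
# executed, even when a builtin of the same name exists.
kill_path=
if [ -n "$BASH_VERSION" ]; then
    kill_path=$(type -P kill) || kill_path=
fi

echo "$kind"
echo "external kill: ${kill_path:-not found}"
```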

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet at case.edu    http://tiswww.cwru.edu/~chet/