lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <m18s69v8zb.fsf@fess.ebiederm.org>
Date:   Fri, 26 Mar 2021 17:35:52 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Jens Axboe <axboe@...nel.dk>
Cc:     io-uring@...r.kernel.org, torvalds@...ux-foundation.org,
        metze@...ba.org, oleg@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

Jens Axboe <axboe@...nel.dk> writes:

> On 3/26/21 4:23 PM, Eric W. Biederman wrote:
>> Jens Axboe <axboe@...nel.dk> writes:
>> 
>>> On 3/26/21 2:29 PM, Eric W. Biederman wrote:
>>>> Jens Axboe <axboe@...nel.dk> writes:
>>>>
>>>>> We go through various hoops to disallow signals for the IO threads, but
>>>>> there's really no reason why we cannot just allow them. The IO threads
>>>>> never return to userspace like a normal thread, and hence don't go through
>>>>> normal signal processing. Instead, just check for a pending signal as part
>>>>> of the work loop, and call get_signal() to handle it for us if anything
>>>>> is pending.
>>>>>
>>>>> With that, we can support receiving signals, including special ones like
>>>>> SIGSTOP.
>>>>>
>>>>> Signed-off-by: Jens Axboe <axboe@...nel.dk>
>>>>> ---
>>>>>  fs/io-wq.c    | 24 +++++++++++++++++-------
>>>>>  fs/io_uring.c | 12 ++++++++----
>>>>>  2 files changed, 25 insertions(+), 11 deletions(-)
>>>>>
>>>>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>>>>> index b7c1fa932cb3..3e2f059a1737 100644
>>>>> --- a/fs/io-wq.c
>>>>> +++ b/fs/io-wq.c
>>>>> @@ -16,7 +16,6 @@
>>>>>  #include <linux/rculist_nulls.h>
>>>>>  #include <linux/cpu.h>
>>>>>  #include <linux/tracehook.h>
>>>>> -#include <linux/freezer.h>
>>>>>  
>>>>>  #include "../kernel/sched/sched.h"
>>>>>  #include "io-wq.h"
>>>>> @@ -503,10 +502,16 @@ static int io_wqe_worker(void *data)
>>>>>  		if (io_flush_signals())
>>>>>  			continue;
>>>>>  		ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
>>>>> -		if (try_to_freeze() || ret)
>>>>> +		if (signal_pending(current)) {
>>>>> +			struct ksignal ksig;
>>>>> +
>>>>> +			if (fatal_signal_pending(current))
>>>>> +				break;
>>>>> +			if (get_signal(&ksig))
>>>>> +				continue;
>>>>                         ^^^^^^^^^^^^^^^^^^^^^^
>>>>
>>>> That is wrong.  You are promising to deliver a signal to signal
>>>> handler and them simply discarding it.  Perhaps:
>>>>
>>>> 			if (!get_signal(&ksig))
>>>>                         	continue;
>>>> 			WARN_ON(!sig_kernel_stop(ksig->sig));
>>>>                         break;
>>>
>>> Thanks, updated.
>> 
>> Gah.  Kill the WARN_ON.
>> 
>> I was thinking "WARN_ON(!sig_kernel_fatal(ksig->sig));"
>> The function sig_kernel_fatal does not exist.
>> 
>> Fatal is the state that is left when a signal is neither
>> ignored nor a stop signal, and does not have a handler.
>> 
>> The rest of the logic still works.
>
> I've just come to the same conclusion myself after testing it.
> Of the 3 cases, most of them can do the continue, but doesn't
> really matter with the way the loop is structured. Anyway, looks
> like this now:

This idiom in the code:
> +		if (signal_pending(current)) {
> +			struct ksignal ksig;
> +
> +			if (fatal_signal_pending(current))
> +				break;
> +			if (!get_signal(&ksig))
> +				continue;
>  }

Needs to be:
> +		if (signal_pending(current)) {
> +			struct ksignal ksig;
> +
> +			if (!get_signal(&ksig))
> +				continue;
> +			break;
>  }

Because any signal returned from get_signal is fatal in this case.
It might make sense to "WARN_ON(ksig->ka.sa.sa_handler != SIG_DFL)".
As the io workers don't handle that case.

It won't happen because you have everything blocked.

The extra fatal_signal_pending(current) logic is just confusing in this
case.

Eric

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ