Message-ID: <7412912a-a470-bd3d-fb1c-54c094cc01ee@oracle.com>
Date: Thu, 18 May 2023 10:27:12 -0500
From: Mike Christie <michael.christie@...cle.com>
To: Christian Brauner <brauner@...nel.org>
Cc: oleg@...hat.com, linux@...mhuis.info, nicolas.dichtel@...nd.com,
axboe@...nel.dk, ebiederm@...ssion.com,
torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
virtualization@...ts.linux-foundation.org, mst@...hat.com,
sgarzare@...hat.com, jasowang@...hat.com, stefanha@...hat.com
Subject: Re: [RFC PATCH 1/8] signal: Dequeue SIGKILL even if
SIGNAL_GROUP_EXIT/group_exec_task is set
On 5/18/23 3:08 AM, Christian Brauner wrote:
> On Wed, May 17, 2023 at 07:09:13PM -0500, Mike Christie wrote:
>> This has us dequeue SIGKILL even if SIGNAL_GROUP_EXIT/group_exec_task
>> is set when we are dealing with PF_USER_WORKER tasks.
>>
>> When a vhost_task gets a SIGKILL, we could have outstanding IO in flight.
>> We can easily stop new work/IO from being queued to the vhost_task, but
>> for IO that's already been sent to something like the block layer we
>> for IO that's already been sent to something like the block layer, we
>> need to wait for the response and then process it. These types of IO
>> completions use the vhost_task to process the completion, so we can't
>> exit immediately.
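>>
>> As an illustration (a sketch loosely modeled on vhost_scsi, not the
>> actual code; the demo_* names are placeholders, vhost_work_queue()
>> is the real vhost API), a block-layer completion can't finish the
>> command itself, so it bounces it back to the vhost_task:
>>
>>   static void demo_complete_cmd(struct demo_cmd *cmd)
>>   {
>>           struct demo_dev *vs = cmd->vs;
>>
>>           llist_add(&cmd->completion_list, &vs->completion_llist);
>>           /* the handler for this work runs in the vhost_task */
>>           vhost_work_queue(&vs->dev, &vs->completion_work);
>>   }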
>>
>> We need to wait for and then handle those completions from the
>> vhost_task, but when we have a SIGKILL pending, functions like
>> schedule() return immediately so we can't wait like normal. Functions
>> like vhost_worker() degrade to just a while(1); loop.
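>>
>> A minimal sketch of that degradation (work_pending()/do_work() are
>> placeholder names, not the actual vhost_worker() code):
>>
>>   for (;;) {
>>           set_current_state(TASK_INTERRUPTIBLE);
>>           if (!work_pending()) {
>>                   /* With SIGKILL pending, signal_pending_state()
>>                    * keeps the task runnable, so schedule() returns
>>                    * immediately and this spins instead of sleeping.
>>                    */
>>                   schedule();
>>                   continue;
>>           }
>>           __set_current_state(TASK_RUNNING);
>>           do_work();
>>   }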
>>
>> This patch has get_signal drop down to the normal code path when
>> SIGNAL_GROUP_EXIT/group_exec_task is set so the caller can detect
>> that a SIGKILL is pending and still perform some blocking cleanup.
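>>
>> Roughly the shape of the change in get_signal() (a sketch, not the
>> literal diff):
>>
>>   if ((signal->flags & SIGNAL_GROUP_EXIT) ||
>>        signal->group_exec_task) {
>>           if (!(current->flags & PF_USER_WORKER)) {
>>                   /* existing fast path: take SIGKILL and exit */
>>                   ...
>>                   goto fatal;
>>           }
>>           /* PF_USER_WORKER: drop to the normal dequeue path so
>>            * the worker sees SIGKILL but can still block for
>>            * cleanup.
>>            */
>>   }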
>>
>> Note that I'm now bypassing the chunk that does:
>>
>> sigdelset(&current->pending.signal, SIGKILL);
>>
>> We look to be OK, because in the places where we set
>> SIGNAL_GROUP_EXIT/group_exec_task we are already doing that on the
>> threads in the group.
>>
>> Signed-off-by: Mike Christie <michael.christie@...cle.com>
>> ---
>
> I think you just got confused by the original discussion that was split
> into two separate threads:
>
> (1) The discussion based on your original proposal to adjust the signal
> handling logic to accommodate vhost workers as they are right now.
> That's where Oleg jumped in.
> (2) My request - which you did in this series - of rewriting vhost
> workers to behave more like io_uring workers.
>
> Both problems are orthogonal. The gist of my proposal is to avoid (1) by
> doing (2). So the only change that's needed is
> s/PF_IO_WORKER/PF_USER_WORKER/ which is pretty obvious as io_uring
> workers and vhost workers now almost fully collapse into the same
> concept.
>
> So forget (1). If additional signal patches are needed as discussed in
> (1) then it must be because of a bug that would affect io_uring workers
> today.
Maybe I didn't exactly misunderstand you. I did patch 1/8 to show the
issues I hit while doing patches 2-8. See my reply to Eric's question
about what I'm hitting and why doing only the last part of the patch
did not work for me:
https://lore.kernel.org/lkml/20230518000920.191583-2-michael.christie@oracle.com/T/#mc6286d1a42c79761248ba55f1dd7a433379be6d1