linux-kernel - Re: [RFC PATCH 1/8] signal: Dequeue SIGKILL even if SIGNAL_GROUP_EXIT/group_exec

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20230518-ratgeber-erbeben-843e68b0d6ac@brauner>
Date:   Thu, 18 May 2023 19:07:55 +0200
From:   Christian Brauner <brauner@...nel.org>
To:     Mike Christie <michael.christie@...cle.com>
Cc:     oleg@...hat.com, linux@...mhuis.info, nicolas.dichtel@...nd.com,
        axboe@...nel.dk, ebiederm@...ssion.com,
        torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org, mst@...hat.com,
        sgarzare@...hat.com, jasowang@...hat.com, stefanha@...hat.com
Subject: Re: [RFC PATCH 1/8] signal: Dequeue SIGKILL even if
 SIGNAL_GROUP_EXIT/group_exec_task is set

On Thu, May 18, 2023 at 10:27:12AM -0500, Mike Christie wrote:
> On 5/18/23 3:08 AM, Christian Brauner wrote:
> > On Wed, May 17, 2023 at 07:09:13PM -0500, Mike Christie wrote:
> >> This has us deqeue SIGKILL even if SIGNAL_GROUP_EXIT/group_exec_task is
> >> set when we are dealing with PF_USER_WORKER tasks.
> >>
> >> When a vhost_task gets a SIGKILL, we could have outstanding IO in flight.
> >> We can easily stop new work/IO from being queued to the vhost_task, but
> >> for IO that's already been sent to something like the block layer we
> >> need to wait for the response then process it. These type of IO
> >> completions use the vhost_task to process the completion so we can't
> >> exit immediately.
> >>
> >> We need to handle wait for then handle those completions from the
> >> vhost_task, but when we have a SIGKLL pending, functions like
> >> schedule() return immediately so we can't wait like normal. Functions
> >> like vhost_worker() degrade to just a while(1); loop.
> >>
> >> This patch has get_signal drop down to the normal code path when
> >> SIGNAL_GROUP_EXIT/group_exec_task is set so the caller can still detect
> >> there is a SIGKILL but still perform some blocking cleanup.
> >>
> >> Note that in that chunk I'm now bypassing that does:
> >>
> >> sigdelset(&current->pending.signal, SIGKILL);
> >>
> >> we look to be ok, because in the places we set SIGNAL_GROUP_EXIT/
> >> group_exec_task we are already doing that on the threads in the
> >> group.
> >>
> >> Signed-off-by: Mike Christie <michael.christie@...cle.com>
> >> ---
> > 
> > I think you just got confused by the original discussion that was split
> > into two separate threads:
> > 
> > (1) The discussion based on your original proposal to adjust the signal
> >     handling logic to accommodate vhost workers as they are right now.
> >     That's where Oleg jumped in.
> > (2) My request - which you did in this series - of rewriting vhost
> >     workers to behave more like io_uring workers.
> > 
> > Both problems are orthogonal. The gist of my proposal is to avoid (1) by
> > doing (2). So the only change that's needed is
> > s/PF_IO_WORKER/PF_USER_WORKER/ which is pretty obvious as io_uring
> > workers and vhost workers no almost fully collapse into the same
> > concept.
> > 
> > So forget (1). If additional signal patches are needed as discussed in
> > (1) then it must be because of a bug that would affect io_uring workers
> > today.
> 
> I maybe didn't exactly misunderstand you. I did patch 1/8 to show issues I
> hit when I'm doing 2-8. See my reply to Eric's question about what I'm
> hitting and why the last part of the patch only did not work for me:
> 
> https://lore.kernel.org/lkml/20230518000920.191583-2-michael.christie@oracle.com/T/#mc6286d1a42c79761248ba55f1dd7a433379be6d1

Yeah, but these are issues that exist with PF_IO_WORKER then too which
was sort of my point.