linux-kernel - Re: [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87bki2h6v1.fsf@email.froward.int.ebiederm.org>
Date:   Mon, 29 May 2023 21:48:34 -0500
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     michael.christie@...cle.com
Cc:     Oleg Nesterov <oleg@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux@...mhuis.info, nicolas.dichtel@...nd.com, axboe@...nel.dk,
        linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org, mst@...hat.com,
        sgarzare@...hat.com, jasowang@...hat.com, stefanha@...hat.com,
        brauner@...nel.org
Subject: Re: [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps
 regression

michael.christie@...cle.com writes:

> On 5/29/23 2:35 PM, Mike Christie wrote:
>>> Hmm... If we you CLONE_THREAD the exiting vhost_worker() will auto-reap itself,
>> Oh wait, are you saying that when we get auto-reaped then we would do the last
>> fput and call the file_operations->release function right? We actually set
>> task_struct->files = NULL for the vhost_task task_struct, so I think we call
>> release a little sooner than you think.
>> 
>> vhost_task_create() sets kernel_clone_args->no_files, so the vhost_task task_struc
>> that gets created works like kthreads where it doesn't do a CLONE_FILES and it
>> doesn't do a dup_fd.
>> 
>> So when we do de_thread() -> zap_other_threads(), that will kill all the threads
>> in the group right? So when they exit, it will call our release function since
>> we don't have refcount on ourself.
>> 
>
> Just to make sure I'm on the same page now.
>
> In the past thread when were discussing the patch below and you guys were saying
> that it doesn't really ignore SIGKILL because we will hit the
> SIGNAL_GROUP_EXIT/group_exec_task checks and the parent is going to exit, it was
> on purpose.
>
> Instead of a signal to tell me when do exit, I was using the parent exiting and doing
> the last fput on the vhost device's file which calls our release function. That then
> allowed the vhost code to use the vhost_task to handle the outstanding IOs and then
> do vhost_task_should_stop to tell the vhost_task to exit when the oustanding IO
> had completed.
>
> On the vhost side of things it's really nice, because all the shutdown paths work
> the same and we don't need rcu/locking in the main IO path to handle the vhost_task
> exiting while we are using it.

The code below does nothing for exec.

You really need to call get_signal to handle SIGSTOP/freeze/process exit.

A variant on my coredump proposal looks like it can handle exec as well.
It isn't pretty but it looks good enough for right now.

If you could test the patch I posted up thread I think that is something
imperfect that is good enough for now.

Eric