linux-kernel - Re: [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ff1bce1a-62d8-1b2e-4560-1ce1ffc209bc@oracle.com>
Date:   Mon, 29 May 2023 14:35:18 -0500
From:   Mike Christie <michael.christie@...cle.com>
To:     Oleg Nesterov <oleg@...hat.com>
Cc:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux@...mhuis.info, nicolas.dichtel@...nd.com, axboe@...nel.dk,
        linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org, mst@...hat.com,
        sgarzare@...hat.com, jasowang@...hat.com, stefanha@...hat.com,
        brauner@...nel.org
Subject: Re: [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps
 regression

On 5/29/23 12:46 PM, Oleg Nesterov wrote:
> Mike, sorry, I don't understand your email.
> 
> Just in case, let me remind I know nothing about drivers/vhost/
> 

No problem. I get it. I don't know the signal/thread code so it's
one of those things where I'm also learning as I go.

> On 05/29, michael.christie@...cle.com wrote:
>>
>> On 5/29/23 6:19 AM, Oleg Nesterov wrote:
>>> On 05/27, Eric W. Biederman wrote:
>>>>
>>>> Looking forward I don't see not asking the worker threads to stop
>>>> for the coredump right now causing any problems in the future.
>>>> So I think we can use this to resolve the coredump issue I spotted.
>>>
>>> But we have almost the same problem with exec.
>>>
>>> Execing thread will wait for vhost_worker() while vhost_worker will wait for
>>> .release -> vhost_task_stop().
>>
>> For this type of case, what is the goal or correct behavior in the end?
>>
>> When get_signal returns true we can code things like you mention below
> 
> and you have mentioned in the next email that you have already coded something
> like this, so perhaps we can delay the further discussions until you send the
> new code?
> 

Ok. Let me post that. You guys and the vhost devs can argue about if it's
too ugly to merge :)

>> and
>> clean up the task_struct.
> 
> Hmm... If we you CLONE_THREAD the exiting vhost_worker() will auto-reap itself,

Oh wait, are you saying that when we get auto-reaped then we would do the last
fput and call the file_operations->release function right? We actually set
task_struct->files = NULL for the vhost_task task_struct, so I think we call
release a little sooner than you think.

vhost_task_create() sets kernel_clone_args->no_files, so the vhost_task task_struc
that gets created works like kthreads where it doesn't do a CLONE_FILES and it
doesn't do a dup_fd.

So when we do de_thread() -> zap_other_threads(), that will kill all the threads
in the group right? So when they exit, it will call our release function since
we don't have refcount on ourself.

> 
>> However, we now have a non-functioning vhost device
>> open and just sitting around taking up memory and it can't do any IO.
> 
> can't comment, see above.
> 
>> For this type of case, do we expect just not to crash/hang, or was this new
>> exec'd thread suppose to be able to use the vhost device?
> 
> I just tried to point out that (unless I missed something) there are more corner
> cases, not just coredump.

Ok. I see.

> 
>>> Or suppose that vhost_worker's sub-thread forks a child with CLONE_FILES...
>>
>> You mean the vhost_task's task/thread doing a function that does a copy_process
>> right?
> 
> I meant that the vhost_task's sub-thread can do sys_clone(CLONE_FILES) from
> userspace.

I think we were talking about different things. I was saying that when the vhost
layer does vhost_task_create() -> copy_process(), the kernel_clone_args.fn is
vhost_task_fn() -> vhost_worker(). I thought you were concerned about vhost_worker()
or some function it calls, calling copy_process() with CLONE_FILES.