[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ff1bce1a-62d8-1b2e-4560-1ce1ffc209bc@oracle.com>
Date: Mon, 29 May 2023 14:35:18 -0500
From: Mike Christie <michael.christie@...cle.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux@...mhuis.info, nicolas.dichtel@...nd.com, axboe@...nel.dk,
linux-kernel@...r.kernel.org,
virtualization@...ts.linux-foundation.org, mst@...hat.com,
sgarzare@...hat.com, jasowang@...hat.com, stefanha@...hat.com,
brauner@...nel.org
Subject: Re: [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps
regression
On 5/29/23 12:46 PM, Oleg Nesterov wrote:
> Mike, sorry, I don't understand your email.
>
> Just in case, let me remind I know nothing about drivers/vhost/
>
No problem. I get it. I don't know the signal/thread code so it's
one of those things where I'm also learning as I go.
> On 05/29, michael.christie@...cle.com wrote:
>>
>> On 5/29/23 6:19 AM, Oleg Nesterov wrote:
>>> On 05/27, Eric W. Biederman wrote:
>>>>
>>>> Looking forward I don't see not asking the worker threads to stop
>>>> for the coredump right now causing any problems in the future.
>>>> So I think we can use this to resolve the coredump issue I spotted.
>>>
>>> But we have almost the same problem with exec.
>>>
>>> Execing thread will wait for vhost_worker() while vhost_worker will wait for
>>> .release -> vhost_task_stop().
>>
>> For this type of case, what is the goal or correct behavior in the end?
>>
>> When get_signal returns true we can code things like you mention below
>
> and you have mentioned in the next email that you have already coded something
> like this, so perhaps we can delay the further discussions until you send the
> new code?
>
Ok. Let me post that. You guys and the vhost devs can argue about if it's
too ugly to merge :)
>> and
>> clean up the task_struct.
>
> Hmm... If we you CLONE_THREAD the exiting vhost_worker() will auto-reap itself,
Oh wait, are you saying that when we get auto-reaped then we would do the last
fput and call the file_operations->release function right? We actually set
task_struct->files = NULL for the vhost_task task_struct, so I think we call
release a little sooner than you think.
vhost_task_create() sets kernel_clone_args->no_files, so the vhost_task task_struc
that gets created works like kthreads where it doesn't do a CLONE_FILES and it
doesn't do a dup_fd.
So when we do de_thread() -> zap_other_threads(), that will kill all the threads
in the group right? So when they exit, it will call our release function since
we don't have refcount on ourself.
>
>> However, we now have a non-functioning vhost device
>> open and just sitting around taking up memory and it can't do any IO.
>
> can't comment, see above.
>
>> For this type of case, do we expect just not to crash/hang, or was this new
>> exec'd thread suppose to be able to use the vhost device?
>
> I just tried to point out that (unless I missed something) there are more corner
> cases, not just coredump.
Ok. I see.
>
>>> Or suppose that vhost_worker's sub-thread forks a child with CLONE_FILES...
>>
>> You mean the vhost_task's task/thread doing a function that does a copy_process
>> right?
>
> I meant that the vhost_task's sub-thread can do sys_clone(CLONE_FILES) from
> userspace.
I think we were talking about different things. I was saying that when the vhost
layer does vhost_task_create() -> copy_process(), the kernel_clone_args.fn is
vhost_task_fn() -> vhost_worker(). I thought you were concerned about vhost_worker()
or some function it calls, calling copy_process() with CLONE_FILES.
Powered by blists - more mailing lists