[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ad128a2f-df04-ed1b-f444-24c1754bf80c@oracle.com>
Date: Tue, 30 May 2023 22:30:37 -0500
From: Mike Christie <michael.christie@...cle.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Oleg Nesterov <oleg@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux@...mhuis.info, nicolas.dichtel@...nd.com, axboe@...nel.dk,
linux-kernel@...r.kernel.org,
virtualization@...ts.linux-foundation.org, mst@...hat.com,
sgarzare@...hat.com, jasowang@...hat.com, stefanha@...hat.com,
brauner@...nel.org
Subject: Re: [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps
regression
On 5/29/23 9:38 PM, Eric W. Biederman wrote:
> diff --git a/kernel/vhost_task.c b/kernel/vhost_task.c
> index b7cbd66f889e..f3ce0fa6edd7 100644
> --- a/kernel/vhost_task.c
> +++ b/kernel/vhost_task.c
...
> static int vhost_task_fn(void *data)
> {
> struct vhost_task *vtsk = data;
> - int ret;
> + bool dead = false;
> +
> + for (;;) {
> + bool did_work;
> +
> + /* mb paired w/ kthread_stop */
> + set_current_state(TASK_INTERRUPTIBLE);
> +
> + if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags)) {
> + __set_current_state(TASK_RUNNING);
> + break;
> + }
> +
> + if (!dead && signal_pending(current)) {
> + struct ksignal ksig;
> + /*
> + * Calling get_signal will block in SIGSTOP,
> + * or clear fatal_signal_pending, but remember
> + * what was set.
> + *
> + * This thread won't actually exit until all
> + * of the file descriptors are closed, and
> + * the release function is called.
> + */
> + dead = get_signal(&ksig);
Hey Eric, the patch works well. Thanks! There was just one small issue.
get_signal() does try_to_freeze() -> ... __might_sleep() which wants the
state to be TASK_RUNNING.
We just need the patch below on top of yours which I think also cleans up
some of the state setting weirdness with the code like where vhost.c calls
__set_current_state(TASK_RUNNING) for each work. It looks like that was
not needed for any reason like a work->fn changing the state and the old
code could have done:
node = llist_del_all(&worker->work_list);
if (!node) {
schedule();
continue;
} else {
__set_current_state(TASK_RUNNING);
}
So I think we can do the following on top of your patch:
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 221d1b6c1be5..074273020849 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -346,7 +346,6 @@ static bool vhost_worker(void *data)
smp_wmb();
llist_for_each_entry_safe(work, work_next, node, node) {
clear_bit(VHOST_WORK_QUEUED, &work->flags);
- __set_current_state(TASK_RUNNING);
kcov_remote_start_common(worker->kcov_handle);
work->fn(work);
kcov_remote_stop();
diff --git a/kernel/vhost_task.c b/kernel/vhost_task.c
index f3ce0fa6edd7..fead2ed29561 100644
--- a/kernel/vhost_task.c
+++ b/kernel/vhost_task.c
@@ -29,12 +29,8 @@ static int vhost_task_fn(void *data)
bool did_work;
/* mb paired w/ kthread_stop */
- set_current_state(TASK_INTERRUPTIBLE);
-
- if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags)) {
- __set_current_state(TASK_RUNNING);
+ if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags))
break;
- }
if (!dead && signal_pending(current)) {
struct ksignal ksig;
@@ -53,8 +49,10 @@ static int vhost_task_fn(void *data)
}
did_work = vtsk->fn(vtsk->data);
- if (!did_work)
+ if (!did_work) {
+ set_current_state(TASK_INTERRUPTIBLE);
schedule();
+ }
}
complete(&vtsk->exited);
Powered by blists - more mailing lists