Message-Id: <20211222225350.1912249-1-vipinsh@google.com>
Date: Wed, 22 Dec 2021 22:53:50 +0000
From: Vipin Sharma <vipinsh@...gle.com>
To: pbonzini@...hat.com, seanjc@...gle.com, tj@...nel.org,
lizefan.x@...edance.com, hannes@...xchg.org
Cc: dmatlack@...gle.com, jiangshanlai@...il.com, kvm@...r.kernel.org,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
Vipin Sharma <vipinsh@...gle.com>
Subject: [PATCH v2] KVM: Move VM's worker kthreads back to the original
cgroups before exiting.
VM worker kthreads can linger in the VM process's cgroup for some time
after KVM terminates the VM process.
KVM terminates the worker kthreads by calling kthread_stop(), which waits
on the 'exited' completion. That completion is triggered by exit_mm(),
via mm_release(), during the kthread's exit. However, the kthreads are
removed from the cgroup by cgroup_exit(), which runs after exit_mm(). The
VM process can therefore terminate in the window between exit_mm() and
cgroup_exit(), leaving only worker kthreads in the cgroup.
Moving the worker kthreads back to the original cgroup (kthreadd_task's
cgroup) ensures that the cgroup is empty as soon as the main VM process
has terminated.
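
The ordering problem described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: every
name in it (model_task, model_exit_mm(), worker_lingers_at_stop(), and
so on) is hypothetical and stands in for the kernel behaviour described
in this message; it is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these types are the kernel's. */
enum { MODEL_ROOT_CGROUP = 0, MODEL_VM_CGROUP = 1 };

struct model_task {
	int cgroup;      /* cgroup the task currently belongs to */
	bool exited;     /* stands in for the 'exited' completion */
	bool in_cgroup;  /* false once the modelled cgroup_exit() has run */
};

/* First exit step: exit_mm() -> mm_release() completes 'exited'. */
static void model_exit_mm(struct model_task *t)
{
	t->exited = true;
}

/* Later exit step: cgroup_exit() removes the task from its cgroup. */
static void model_cgroup_exit(struct model_task *t)
{
	t->in_cgroup = false;
}

/* What the patch adds: migrate back before the thread function returns. */
static void model_move_to_root(struct model_task *t)
{
	t->cgroup = MODEL_ROOT_CGROUP;
}

/* True if the task still populates the VM process's cgroup. */
static bool model_lingers(const struct model_task *t)
{
	return t->in_cgroup && t->cgroup == MODEL_VM_CGROUP;
}

/*
 * Report whether the worker still sits in the VM cgroup at the point
 * where kthread_stop() may return: after exit_mm(), before cgroup_exit().
 */
bool worker_lingers_at_stop(bool apply_fix)
{
	struct model_task worker = {
		.cgroup = MODEL_VM_CGROUP,
		.exited = false,
		.in_cgroup = true,
	};
	bool lingering;

	if (apply_fix)
		model_move_to_root(&worker);

	model_exit_mm(&worker);
	assert(worker.exited);  /* the stopper is released from here on */
	lingering = model_lingers(&worker);

	model_cgroup_exit(&worker);
	return lingering;
}
```

In this model, worker_lingers_at_stop(false) reports that the worker is
still in the VM cgroup when the stopper is released, while
worker_lingers_at_stop(true) reports that it has already left.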
kthreadd_task is not an exported symbol, which causes build errors if KVM
is built as a loadable module. Both users of cgroup_attach_task_all()
(kvm_main and vhost) have the same issue, so the API now uses
kthreadd_task as the default source task when it is called with a NULL
@from argument.
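
The NULL-default convention can be sketched in userspace as follows.
This is a hedged illustration only: demo_task, demo_kthreadd and
demo_attach_task_all() are hypothetical stand-ins, and the single
cgroup_id field is an assumption that replaces the real per-hierarchy
lookup. The point is that the default task stays private to the core
code, so callers never need to reference its symbol.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; one id replaces all hierarchies. */
struct demo_task {
	int cgroup_id;
};

/* Stand-in for kthreadd_task: visible only to the "core" code below. */
static struct demo_task demo_kthreadd = { .cgroup_id = 0 };

/*
 * Attach @tsk to the cgroup of @from. If @from is NULL, fall back to
 * the private default task, mirroring the convention in this patch.
 * Returns 0 on success or a negative errno value on failure.
 */
int demo_attach_task_all(const struct demo_task *from, struct demo_task *tsk)
{
	if (!tsk)
		return -EINVAL;

	if (!from)
		from = &demo_kthreadd;

	tsk->cgroup_id = from->cgroup_id;
	return 0;
}
```

A caller that wants to send a worker back to the default cgroup passes
NULL for the source task, for example demo_attach_task_all(NULL, &worker).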
Signed-off-by: Vipin Sharma <vipinsh@...gle.com>
---
v2:
- Use kthreadd_task in the cgroup API to avoid build issue.
v1: https://lore.kernel.org/lkml/20211214050708.4040200-1-vipinsh@google.com/
kernel/cgroup/cgroup-v1.c | 5 +++++
virt/kvm/kvm_main.c | 15 ++++++++++++++-
2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 81c9e0685948..81d4b2f2acf0 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -51,6 +51,8 @@ bool cgroup1_ssid_disabled(int ssid)
* @from: attach to all cgroups of a given task
* @tsk: the task to be attached
*
+ * If @from is NULL then use kthreadd_task for finding the destination cgroups.
+ *
* Return: %0 on success or a negative errno code on failure
*/
int cgroup_attach_task_all(struct task_struct *from, struct task_struct *tsk)
@@ -58,6 +60,9 @@ int cgroup_attach_task_all(struct task_struct *from, struct task_struct *tsk)
struct cgroup_root *root;
int retval = 0;
+ if (!from)
+ from = kthreadd_task;
+
mutex_lock(&cgroup_mutex);
percpu_down_write(&cgroup_threadgroup_rwsem);
for_each_root(root) {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b0f7e6eb00ff..f7504578c374 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5785,7 +5785,7 @@ static int kvm_vm_worker_thread(void *context)
init_context = NULL;
if (err)
- return err;
+ goto out;
/* Wait to be woken up by the spawner before proceeding. */
kthread_parkme();
@@ -5793,6 +5793,19 @@ static int kvm_vm_worker_thread(void *context)
if (!kthread_should_stop())
err = thread_fn(kvm, data);
+out:
+ /*
+ * We need to move the kthread back to its original cgroups, so that it
+ * doesn't linger in the cgroups of the user process after the user
+ * process has already terminated.
+ *
+ * kthread_stop() waits on the 'exited' completion, which is completed
+ * in exit_mm(), via mm_release(), in do_exit(). However, the kthread
+ * is removed from its cgroups in cgroup_exit(), which is called after
+ * exit_mm(). This lets kthreads linger in the cgroups after the main
+ * VM process has finished.
+ */
+ WARN_ON(cgroup_attach_task_all(NULL, current));
return err;
}
base-commit: 5e4e84f1124aa02643833b7ea40abd5a8e964388
--
2.34.1.307.g9b7440fafd-goog