Message-Id: <20220222054848.563321-1-vipinsh@google.com>
Date: Tue, 22 Feb 2022 05:48:48 +0000
From: Vipin Sharma <vipinsh@...gle.com>
To: pbonzini@...hat.com, seanjc@...gle.com
Cc: mkoutny@...e.com, tj@...nel.org, lizefan.x@...edance.com,
hannes@...xchg.org, dmatlack@...gle.com, jiangshanlai@...il.com,
kvm@...r.kernel.org, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org, Vipin Sharma <vipinsh@...gle.com>
Subject: [PATCH v4] KVM: Move VM's worker kthreads back to the original cgroup
before exiting.
VM worker kthreads can linger in the VM process's cgroup for some time
after KVM terminates the VM process.
KVM terminates the worker kthreads by calling kthread_stop(), which waits
on the 'exited' completion, triggered by exit_mm(), via mm_release(), in
do_exit() during the kthread's exit. However, these kthreads are
removed from the cgroup by cgroup_exit(), which runs after exit_mm().
Therefore, a VM process can terminate between the exit_mm() and
cgroup_exit() calls, leaving only worker kthreads in the cgroup.
Moving worker kthreads back to the original cgroup (kthreadd_task's
cgroup) makes sure that the cgroup is empty as soon as the main VM
process is terminated.
Signed-off-by: Vipin Sharma <vipinsh@...gle.com>
Suggested-by: Sean Christopherson <seanjc@...gle.com>
---
Thanks, Sean, for the example of how to use real_parent outside of the RCU
critical section. I put your name in Suggested-by; I hope you are fine with
that and that this is the right tag/way to give you the credit.
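For illustration only (not part of the patch): a minimal userspace C sketch
of the ordering described in the commit message above. All names here
(model_kthread, model_exit_step, model_lingers_after_stop) are hypothetical,
and the model reduces do_exit() to just the two steps the commit message
names; it is not how the kernel implements any of this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: only the two do_exit() steps named in the commit message. */
enum exit_step { STEP_EXIT_MM, STEP_CGROUP_EXIT, STEP_DONE };

struct model_kthread {
	enum exit_step next;   /* next modeled do_exit() step to run */
	bool exited_signalled; /* stands in for the 'exited' completion */
	bool in_vm_cgroup;     /* still a member of the VM process's cgroup */
};

/*
 * moved_back_first models the kthread re-attaching itself to its
 * original cgroup before it starts exiting.
 */
static void model_kthread_init(struct model_kthread *t, bool moved_back_first)
{
	assert(t != NULL);
	t->next = STEP_EXIT_MM;
	t->exited_signalled = false;
	t->in_vm_cgroup = !moved_back_first;
}

/* Run exactly one modeled exit step, in the order exit_mm() -> cgroup_exit(). */
static void model_exit_step(struct model_kthread *t)
{
	assert(t != NULL);
	switch (t->next) {
	case STEP_EXIT_MM:
		/* The stopper is released here ... */
		t->exited_signalled = true;
		t->next = STEP_CGROUP_EXIT;
		break;
	case STEP_CGROUP_EXIT:
		/* ... but cgroup membership is only dropped here. */
		t->in_vm_cgroup = false;
		t->next = STEP_DONE;
		break;
	case STEP_DONE:
		break;
	}
}

/*
 * Models the stopper: step the kthread until the completion fires, then
 * report whether the kthread is still in the VM's cgroup at that moment.
 */
static bool model_lingers_after_stop(bool moved_back_first)
{
	struct model_kthread t;

	model_kthread_init(&t, moved_back_first);
	while (!t.exited_signalled)
		model_exit_step(&t);
	return t.in_vm_cgroup;
}
```

Under this model, model_lingers_after_stop(false) reports the kthread as
still in the VM's cgroup when the stopper resumes, while
model_lingers_after_stop(true), which stands in for the
cgroup_attach_task_all() call added by this patch, does not.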
v4:
- Read task's real_parent in the RCU critical section.
- Don't log error message from the cgroup_attach_task_all() API.
v3: https://lore.kernel.org/lkml/20220217061616.3303271-1-vipinsh@google.com/
- Use 'current->real_parent' (kthreadd_task) in the
cgroup_attach_task_all() call.
- Revert the cgroup API changes made in v2. The patch no longer touches
the cgroup APIs.
- Update the commit message and the code comment.
v2: https://lore.kernel.org/lkml/20211222225350.1912249-1-vipinsh@google.com/
- Use kthreadd_task in the cgroup API to avoid build issue.
v1: https://lore.kernel.org/lkml/20211214050708.4040200-1-vipinsh@google.com/
virt/kvm/kvm_main.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 83c57bcc6eb6..cdf1fa3c60ae 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5810,6 +5810,7 @@ static int kvm_vm_worker_thread(void *context)
* we have to locally copy anything that is needed beyond initialization
*/
struct kvm_vm_worker_thread_context *init_context = context;
+ struct task_struct *parent;
struct kvm *kvm = init_context->kvm;
kvm_vm_thread_fn_t thread_fn = init_context->thread_fn;
uintptr_t data = init_context->data;
@@ -5836,7 +5837,7 @@ static int kvm_vm_worker_thread(void *context)
init_context = NULL;
if (err)
- return err;
+ goto out;
/* Wait to be woken up by the spawner before proceeding. */
kthread_parkme();
@@ -5844,6 +5845,25 @@ static int kvm_vm_worker_thread(void *context)
if (!kthread_should_stop())
err = thread_fn(kvm, data);
+out:
+ /*
+ * Move kthread back to its original cgroup to prevent it lingering in
+ * the cgroup of the VM process, after the latter finishes its
+ * execution.
+ *
+ * kthread_stop() waits on the 'exited' completion condition which is
+ * set in exit_mm(), via mm_release(), in do_exit(). However, the
+ * kthread is removed from the cgroup in the cgroup_exit() which is
+ * called after the exit_mm(). This causes the kthread_stop() to return
+ * before the kthread actually quits the cgroup.
+ */
+ rcu_read_lock();
+ parent = rcu_dereference(current->real_parent);
+ get_task_struct(parent);
+ rcu_read_unlock();
+ cgroup_attach_task_all(parent, current);
+ put_task_struct(parent);
+
return err;
}
base-commit: 1bbc60d0c7e5728aced352e528ef936ebe2344c0
--
2.35.1.473.g83b2b277ed-goog