Date:   Wed, 16 Feb 2022 09:37:45 -0800
From:   Vipin Sharma <vipinsh@...gle.com>
To:     Michal Koutný <mkoutny@...e.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Cc:     Tejun Heo <tj@...nel.org>, seanjc@...gle.com,
        lizefan.x@...edance.com, hannes@...xchg.org, dmatlack@...gle.com,
        jiangshanlai@...il.com, kvm@...r.kernel.org,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] KVM: Move VM's worker kthreads back to the original
 cgroups before exiting.

Hi Paolo, Michal

Paolo:
Will you accept a patch that uses real_parent in
kvm_vm_worker_thread(), as suggested by Sean, while I work out
Michal's recommendation to make kthread_stop() wait via
kernel_wait()?
        cgroup_attach_task_all(current->real_parent, current)

Michal:

On Thu, Jan 20, 2022 at 7:05 AM Michal Koutný <mkoutny@...e.com> wrote:
>
> On Wed, Jan 19, 2022 at 08:30:43AM -1000, Tejun Heo <tj@...nel.org> wrote:
> > It'd be nicer if we can make kthread_stop() waiting more regular but I
> > couldn't find a good existing place and routing the usual parent
> > signaling might be too complicated. Anyone has better ideas?
>
> The regular way is pictured in Paolo's diagram already, the
> exit_notify/do_signal_parent -> wait4 path.
>
> Actually, I can see that there exists already kernel_wait() and is used
> by a UMH wrapper kthread. kthreadd issues ignore_signals() so (besides
> no well defined point of signalling a kthread) the signal notification
> is moot and only waking up the waiter is relevant. So kthread_stop()
> could wait via kernel_wait() based on pid (extracted from task_struct).
>
> Have I missed an obstacle?
>

I must admit I do not have a good understanding of the kernel_wait() and
kthread_stop() APIs. I tried changing kthread_stop(), but I could not get
kernel_wait() to work there. I tested with a small module that starts a
kthread during init, which prints a message every few seconds, and calls
kthread_stop() during module exit. The module worked as intended without
the kernel_wait() changes in kthread_stop().

My change basically replaces the wait_for_completion() call with a
kernel_wait() call:

@@ -645,8 +645,9 @@ int kthread_stop(struct task_struct *k)
        set_bit(KTHREAD_SHOULD_STOP, &kthread->flags);
        kthread_unpark(k);
        wake_up_process(k);
-       wait_for_completion(&kthread->exited);
-       ret = k->exit_code;
+       kernel_wait(k->pid, &ret);
+//     kernel_wait(task_pid_vnr(k), &ret);
+//     wait_for_completion(&kthread->exited);
+//     ret = k->exit_code;
        put_task_struct(k);

I also tried a few other combinations, including placing the
kernel_wait() call after the put_task_struct(k) call.

Every time, the kernel crashed during module exit with:

[  285.014612] RIP: 0010:0xffffffffc04ed074
[  285.018537] RSP: 0018:ffff9ccdc8365ee8 EFLAGS: 00010246
[  285.023761] RAX: 0000000000000000 RBX: 0000000000000012 RCX: 0000000000000001
[  285.030896] RDX: 0000000000000000 RSI: 0000000000000286 RDI: ffff9cce3f7d9cc0
[  285.038028] RBP: ffff9ccdc8365ef8 R08: 0000000000000000 R09: 0000000000015504
[  285.045160] R10: 000000000000004b R11: ffffffff8dd92880 R12: 0000000000000012
[  285.052293] R13: ffff9ccdc813db90 R14: ffff9ccdc7e66240 R15: ffffffffc04ed000
[  285.059425] FS:  0000000000000000(0000) GS:ffff9cce3f7c0000(0000) knlGS:0000000000000000
[  285.067510] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  285.073258] CR2: ffffffffc04ed074 CR3: 000000c07f20e002 CR4: 0000000000362ef0
[  285.080390] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  285.087522] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  285.094656] Call Trace:
[  285.097112]  kthread+0x148/0x1b0
[  285.100343]  ? kthread_blkcg+0x30/0x30
[  285.104096]  ret_from_fork+0x3a/0x60
[  285.107671] Code:  Bad RIP value.
[  285.107671] IP: 0xffffffffc04ecff4:

The crash does not occur if I keep wait_for_completion(&kthread->exited)
alongside kernel_wait(), but I assume kernel_wait() should be sufficient
by itself once I figure out the proper way to use it.

Do you have any suggestions on the right way to use this API?

Thanks
Vipin
