Message-ID: <87mtk6xegz.fsf@email.froward.int.ebiederm.org>
Date: Sat, 08 Jan 2022 12:35:40 -0600
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Alexey Gladkov <legion@...nel.org>,
Kyle Huey <me@...ehuey.com>, Oleg Nesterov <oleg@...hat.com>,
Kees Cook <keescook@...omium.org>,
Heiko Carstens <hca@...ux.ibm.com>,
Vasily Gorbik <gor@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Martin Schwidefsky <schwidefsky@...ibm.com>
Subject: Re: [PATCH 06/10] exit: Implement kthread_exit
Al Viro <viro@...iv.linux.org.uk> writes:
> IMO the right way to handle that would be
> 1) turn these two do_exit() into do_exit(0), to reduce
> confusion
> 2) deal with all do_exit() in kthread payloads. Your
> name for the primitive is fine, IMO.
> 3) make that primitive pass the return value by way of
> a field in struct kthread, adjusting kthread_stop() accordingly
> and passing 0 to do_exit() in kthread_exit() itself.
>
> (2) is not as trivial as you seem to hope, though. Your patches
> in drivers/staging/rt*/ had papered over the problem in there,
> but hadn't really solved it.
>
> thread_exit() should've been shot, all right, but it really ought
> to have been complete_and_exit() there. The thing is, complete()
> + return does *not* guarantee that driver won't get unloaded before
> the thread terminates. Possibly freeing its .code and leaving
> a thread to resume running in there as soon as it regains CPU.
>
> The point of complete_and_exit() is that it's noreturn *and* in
> core kernel. So it can be safely used in a modular kthread,
> if paired with wait_for_completion() in or before module_exit.
> complete() + do_exit() (or complete + return as you've gotten
> there) doesn't give such guarantees at all.

I think we are mostly in agreement here.

There are kernel threads started by modules that do:

	complete(...);
	return 0;

At a minimum those should be calling complete_and_exit(). Possibly they
should be restructured to use kthread_stop().
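
The difference between the two patterns can be sketched with a small userspace model. This is not kernel code: the names text_owner, step, runs_freed_text and the two step tables are hypothetical, and the model assumes the worst case, where the module text is freed the moment the completion fires. It only tracks whose code the thread executes after that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Whose code the thread is executing during a step (hypothetical model). */
enum text_owner { MODULE_TEXT, CORE_TEXT };

struct step {
	enum text_owner owner;	/* module text or core kernel text */
	bool completes;		/* true if this step signals the completion */
};

/*
 * Worst case for the race: module_exit is blocked waiting for the
 * completion, and the module text is freed as soon as it fires.
 * Returns true if the thread would then execute freed module text.
 */
bool runs_freed_text(const struct step *steps, size_t n)
{
	bool unloaded = false;

	assert(steps != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (unloaded && steps[i].owner == MODULE_TEXT)
			return true;
		if (steps[i].completes)
			unloaded = true;
	}
	return false;
}

/* complete(...); return 0;  The return path is still in the module. */
const struct step complete_then_return[] = {
	{ MODULE_TEXT, false },	/* thread body */
	{ CORE_TEXT,   true  },	/* complete() itself */
	{ MODULE_TEXT, false },	/* back in the thread function: return 0 */
};

/* complete_and_exit(...);  Control never comes back to module text. */
const struct step complete_and_exit_path[] = {
	{ MODULE_TEXT, false },	/* thread body */
	{ CORE_TEXT,   true  },	/* complete() inside complete_and_exit() */
	{ CORE_TEXT,   false },	/* do_exit() */
};
```

In this model, runs_freed_text() reports a problem for complete_then_return and none for complete_and_exit_path, which is the guarantee a noreturn helper in the core kernel is meant to provide.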

Some of the users of the now-removed thread_exit() in staging are
among the offenders.

However, thread_exit() was implemented as:

	#define thread_exit() complete_and_exit(NULL, 0)

That does nothing with a completion; it was just a really funny way to
spell "do_exit(0)".
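
A minimal userspace sketch of why the NULL argument makes the macro equivalent to do_exit(0). It assumes complete_and_exit() only signals the completion when the pointer is non-NULL, which is how the helper is understood to have behaved at the time. The functions below are stand-ins that reuse the kernel names; exit_code and exit_calls are hypothetical bookkeeping, and unlike the real do_exit() the stand-in returns so the effect can be inspected.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel type; only a counter is modelled. */
struct completion { unsigned int done; };

long exit_code = -1;	/* last value passed to the do_exit() stand-in */
int exit_calls;		/* how many times the stand-in was called */

void complete(struct completion *c)
{
	assert(c != NULL);	/* the real complete() dereferences its argument */
	c->done++;
}

/* The real do_exit() never returns; this model just records the call. */
void do_exit(long code)
{
	exit_code = code;
	exit_calls++;
}

/* Assumed shape of the helper: complete only if a completion was given. */
void complete_and_exit(struct completion *comp, long code)
{
	if (comp)
		complete(comp);
	do_exit(code);
}

/* The staging macro quoted above. */
#define thread_exit() complete_and_exit(NULL, 0)
```

Calling thread_exit() in this model touches no completion and records exit code 0, so nothing waiting on a completion would ever be woken by it.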

While I agree that digging through all of the kernel threads and finding
the ones that should be calling complete_and_exit() is a fine idea, it is
a concern independent of these patches.

> I'm (re)crawling through that zoo right now, will post when
> I get more details.
Eric