linux-kernel - Re: [PATCH 06/10] exit: Implement kthread

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87mtk6xegz.fsf@email.froward.int.ebiederm.org>
Date:   Sat, 08 Jan 2022 12:35:40 -0600
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     Al Viro <viro@...iv.linux.org.uk>
Cc:     linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Alexey Gladkov <legion@...nel.org>,
        Kyle Huey <me@...ehuey.com>, Oleg Nesterov <oleg@...hat.com>,
        Kees Cook <keescook@...omium.org>,
        Heiko Carstens <hca@...ux.ibm.com>,
        Vasily Gorbik <gor@...ux.ibm.com>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        Alexander Gordeev <agordeev@...ux.ibm.com>,
        Martin Schwidefsky <schwidefsky@...ibm.com>
Subject: Re: [PATCH 06/10] exit: Implement kthread_exit

Al Viro <viro@...iv.linux.org.uk> writes:

> IMO the right way to handle that would be
> 	1) turn these two do_exit() into do_exit(0), to reduce
> confusion
> 	2) deal with all do_exit() in kthread payloads.  Your
> name for the primitive is fine, IMO.
> 	3) make that primitive pass the return value by way of
> a field in struct kthread, adjusting kthread_stop() accordingly
> and passing 0 to do_exit() in kthread_exit() itself.
>
> (2) is not as trivial as you seem to hope, though.  Your patches
> in drivers/staging/rt*/ had papered over the problem in there,
> but hadn't really solved it.
>
> thread_exit() should've been shot, all right, but it really ought
> to have been complete_and_exit() there.  The thing is, complete()
> + return does *not* guarantee that driver won't get unloaded before
> the thread terminates.  Possibly freeing its .code and leaving
> a thread to resume running in there as soon as it regains CPU.
>
> The point of complete_and_exit() is that it's noreturn *and* in
> core kernel.  So it can be safely used in a modular kthread,
> if paired with wait_for_completion() in or before module_exit.
> complete() + do_exit() (or complete + return as you've gotten
> there) doesn't give such guarantees at all.


I think we are mostly in agreement here.

There are kernel threads started by modules that do:
	complete(...);
        return 0;

That should be at a minimum calling complete_and_exit.  Possibly should
be restructured to use kthread_stop().

Some of those users of the now removed thread_exit() in staging are
among the offenders.

However thread_exit() was implemented as:
	#define thread_exit() complete_and_exit(NULL, 0)

Which does nothing with a completion, it was just a really funny way to
spell "do_exit(0)".

While I agree digging through all of the kernel threads and finding the
ones that should be calling complete_and_exit is a fine idea.  It is
a concern independent of these patches.

> I'm (re)crawling through that zoo right now, will post when
> I get more details.

Eric