linux-kernel - Re: [Intel-gfx] signal: break out of wait loops on kthread

Message-ID: <d47b30e9-5619-c631-aa92-f5d89e88a909@linux.intel.com>
Date:   Thu, 20 Oct 2022 14:45:49 +0100
From:   Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>
To:     "Jason A. Donenfeld" <Jason@...c4.com>
Cc:     "Intel-gfx@...ts.freedesktop.org" <Intel-gfx@...ts.freedesktop.org>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        linux-kernel@...r.kernel.org, sultan@...neltoast.com
Subject: Re: [Intel-gfx] signal: break out of wait loops on kthread_stop()

On 19/10/2022 21:19, Jason A. Donenfeld wrote:
> On Wed, Oct 19, 2022 at 09:09:28PM +0100, Tvrtko Ursulin wrote:
>> Hm why is kthread_stop() after kthread_run() abuse? I don't see it in
>> kerneldoc that it must not be used for stopping threads.
> 
> Because you don't want it to stop. You want to wait until it's done. If
> you call stop right after run, it will even stop it before it even
> begins to run. That's why you wind up sprinkling your msleeps
> everywhere, indicating that clearly this is not meant to work that way.

Not after kthread_run(), which already wakes it up. If the kerneldoc for
kthread_stop() is correct, at least... In which case I really do think
that the yields are pointless, a red herring. Perhaps they predate
kthread_run(), and even then they were wrong.

>> Yep the yields and sleeps are horrible and will go. But they are also
>> not relevant for the topic at hand.
> 
> Except they very much are. The reason you need these is because you're
> using kthread_stop() for something it's not meant to do.

It is supposed to assert kthread_should_stop(), which the thread can check
to decide when to exit. Except that now the thread can fail to reach that
controlled exit point. Granted, that argument is moot, since it implies
incomplete error handling in the thread anyway.
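
To make the "controlled exit point" concrete, here is a minimal user-space
sketch of the handshake, using pthreads as a stand-in for kthreads. It is an
analogy only: struct worker, worker_fn(), worker_run() and worker_stop() are
hypothetical names rather than kernel API, and the model covers neither
signal delivery nor the case where the stop request arrives before the
thread function has started.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a kthread; not kernel API. */
struct worker {
	pthread_t thread;
	atomic_bool should_stop;  /* plays the role of kthread_should_stop() */
	atomic_bool cleaned_up;   /* set only at the controlled exit point */
};

static void *worker_fn(void *arg)
{
	struct worker *w = arg;

	/* Check the stop flag between units of work. */
	while (!atomic_load(&w->should_stop))
		sched_yield();  /* placeholder for real work */

	/* Controlled exit point: tidy up before returning. */
	atomic_store(&w->cleaned_up, true);
	return NULL;
}

/* Rough analogue of kthread_run(): create the thread and let it run. */
static int worker_run(struct worker *w)
{
	atomic_init(&w->should_stop, false);
	atomic_init(&w->cleaned_up, false);
	return pthread_create(&w->thread, NULL, worker_fn, w);
}

/* Rough analogue of kthread_stop(): raise the flag, then wait for exit. */
static void worker_stop(struct worker *w)
{
	atomic_store(&w->should_stop, true);
	pthread_join(w->thread, NULL);
}
```

A caller would pair worker_run() with a later worker_stop(). In this model
the thread always reaches its cleanup, because nothing other than the flag
can interrupt the loop.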

Btw, there are actually two use cases in our code base: in one the thread
controls the exit, in the other the caller does. Anyway...

>> Never mind, I was not looking for anything more than a suggestion on how
>> to maybe work around it in piece as someone is dealing with the affected
>> call sites.
> 
> Sultan's kthread_work idea is probably the right direction. This would
> seem to have what you need.

... yes, it can be converted, even though for one of the two use cases
we need explicit signalling. There now isn't anything which would assert
kthread_should_stop() without also asserting the signal, right? Nor did
I find that the kthread_work API can do it.

Fingers crossed we were the only "abusers" of the API. There are quite a
number of kthread_stop() callers and it would be a large job to audit them
all.

Regards,

Tvrtko