linux-kernel - Re: [Intel-gfx] signal: break out of wait loops on kthread

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHmME9pShsMrXwMhA+4FJKkc-nqCE74UQMXYy9fuEsoiDS2G=A@mail.gmail.com>
Date:   Mon, 28 Nov 2022 19:27:18 +0100
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>
Cc:     Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>,
        "Intel-gfx@...ts.freedesktop.org" <Intel-gfx@...ts.freedesktop.org>,
        linux-kernel@...r.kernel.org, sultan@...neltoast.com
Subject: Re: [Intel-gfx] signal: break out of wait loops on kthread_stop()

Hi Eric,

On Mon, Nov 28, 2022 at 7:22 PM Eric W. Biederman <ebiederm@...ssion.com> wrote:
>
> Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com> writes:
>
> > On 19/10/2022 21:19, Jason A. Donenfeld wrote:
> >> On Wed, Oct 19, 2022 at 09:09:28PM +0100, Tvrtko Ursulin wrote:
> >>> Hm why is kthread_stop() after kthread_run() abuse? I don't see it in
> >>> kerneldoc that it must not be used for stopping threads.
> >> Because you don't want it to stop. You want to wait until it's done. If
> >> you call stop right after run, it will even stop it before it even
> >> begins to run. That's why you wind up sprinkling your msleeps
> >> everywhere, indicating that clearly this is not meant to work that way.
> > Not after kthread_run which wakes it up already. If the kerneldoc for
> > kthread_stop() is correct at least... In which case I really do think
> > that the yields are pointless/red herring. Perhaps they predate kthread_run and
> > then they were even wrong.
> >
> >>> Yep the yields and sleeps are horrible and will go. But they are also
> >>> not relevant for the topic at hand.
> >> Except they very much are. The reason you need these is because you're
> >> using kthread_stop() for something it's not meant to do.
> >
> > It is supposed to assert kthread_should_stop() which thread can look at as when
> > to exit. Except that now it can fail to get to that controlled exit
> > point. Granted that argument is moot since it implies incomplete error handling
> > in the thread anyway.
> >
> > Btw there are actually two use cases in our code base. One is thread controls
> > the exit, second is caller controls the exit. Anyway...
> >
> >>> Never mind, I was not looking for anything more than a suggestion on how
> >>> to maybe work around it in piece as someone is dealing with the affected
> >>> call sites.
> >> Sultan's kthread_work idea is probably the right direction. This would
> >> seem to have what you need.
> >
> > ... yes, it can be converted. Even though for one of the two use cases we need
> > explicit signalling. There now isn't anything which would assert
> > kthread_should_stop() without also asserting the signal, right?. Neither
> > I found that the thread work API can do it.
> >
> > Fingers crossed we were the only "abusers" of the API. There's a quite a number
> > of kthread_stop callers and it would be a large job to audit them all.
>
>
> I have been out and am coming to this late.   Did this get resolved?
>
>
> I really don't expect this affected much of anything else as the code
> sat in linux-next for an entire development cycle before being merged.
>
> But I would like to make certain problems with this change were resolved.

I just checked drm-next, and it looks like the i915 people resolved
their issue, and also got rid of those pesky yield()s in the process:
https://cgit.freedesktop.org/drm/drm/commit/?id=6407cf533217e09dfd895e64984c3f1ee3802373

Jason