[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220628141656.cf2jyrelhcylkpfp@wittgenstein>
Date: Tue, 28 Jun 2022 16:16:56 +0200
From: Christian Brauner <brauner@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>,
Tejun Heo <tj@...nel.org>, Petr Mladek <pmladek@...e.com>,
Lai Jiangshan <jiangshanlai@...il.com>,
Michal Hocko <mhocko@...e.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Oleg Nesterov <oleg@...hat.com>
Subject: Re: re. Spurious wakeup on a newly created kthread
On Sat, Jun 25, 2022 at 11:43:15AM -0700, Linus Torvalds wrote:
> On Sat, Jun 25, 2022 at 11:25 AM Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > And that's not at all what the kthread code wants. It wants to set
> > affinity masks, it wants to create a name for the thread, it wants to
> > do all those other things.
> >
> > That code really wants to just do copy_process().
>
> Honestly, I think kernel/kthread.c should be almost rewritten from scratch.
>
> I do not understand why it does all those odd keventd games at all,
> and why kthread_create_info exists in the first place.
>
> Why does kthread_create() not just create the thread directly itself,
> and instead does that odd queue it onto a work function?
>
> Some of that goes back to before the git history, and very little of
> it seems to make any sense. It's as if the code is meant to be able to
> run from interrupt context, but that can't be it: it's literally doing
> a GFP_KERNEL kmalloc, it's doing spin-locks without irq safety etc.
>
> So why is it calling kthreadd_task() to create the thread? Purely for
> some crazy odd "make that the parent" reason?
>
> I dunno. The code is odd, unexplained, looks buggy, and most fo the
> reasons are probably entirely historical.
>
> I'm adding Christian to this thread too, since I get the feeling that
> it really should be more tightly integrated with copy_process(), and
> that Christian might have comments.
>
> Christian, see some context in the thread here:
>
> https://lore.kernel.org/all/CAHk-=wiC7rj1o7vTnYUPfD7YxAu09MZiZbahHqvLm9+Cgg1dFw@mail.gmail.com/
>
> for some of this.
Sorry, I was at LSS last week.
I honestly didn't touch the code back then because it seemed almost
entirely unrelated to regular task creation apart from kernel_thread()
that I added. I didn't feel comfortable changing a lot of stuff there.
Iirc, just a few months ago io_uring still made us of the kthread
infrastructure and I think that made the limits of the interface more
obvious. Now we soon will have two users that create a version of kernel
generated threads with properties of another process (io_uring and [1]).
In my head, the kthread infra should be able to support generation of
pure kernel threads as well as the creation of users workers instead of
adding specialized interfaces to do this. The fact that it doesn't is a
limitation of the interface that imho shows it hasn't grown to adapt to
the new use-cases we have. And imho we'll see more of those.
In this context it's really worth looking at [1] because to some extent
it duplicates bits we have for the kthread infra whereas I still think
the kthread infra should support both possibly exposing two apis one
to return pure kernel threads and the other returning struct user_worker
or similar. Idk, it might just be a heat-stroke talking...
I don't feel comfortable making strong assertions about the original
implementation of kthreads. I wasn't around and there might be
historical context I'm missing.
One issue that Tejun also mentioned later in the thread and that we run
into is that we have a pattern where we create a kthread and then trust
the caller to handle/activate the new task. This is more problematic
once we start supporting something like [1] where that's exposed to a
driver. (Ideally creation of such a task would generate a unique
callback - I think Peter suggested something like this? - that could
only be used on that task...)
[1]: https://lore.kernel.org/lkml/20220620011357.10646-1-michael.christie@oracle.com
Powered by blists - more mailing lists