[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YgzMTrVMCVt+n7cE@kroah.com>
Date: Wed, 16 Feb 2022 11:05:02 +0100
From: Greg KH <gregkh@...uxfoundation.org>
To: Brian Geffon <bgeffon@...gle.com>
Cc: Dave Hansen <dave.hansen@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Willis Kung <williskung@...gle.com>,
Guenter Roeck <groeck@...gle.com>,
Borislav Petkov <bp@...e.de>,
Andy Lutomirski <luto@...nel.org>,
"# v4 . 10+" <stable@...r.kernel.org>,
the arch/x86 maintainers <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH stable 5.4,5.10] x86/fpu: Correct pkru/xstate
inconsistency
On Tue, Feb 15, 2022 at 04:32:48PM -0500, Brian Geffon wrote:
> On Tue, Feb 15, 2022 at 2:45 PM Greg KH <gregkh@...uxfoundation.org> wrote:
> >
> > On Tue, Feb 15, 2022 at 11:22:33AM -0800, Brian Geffon wrote:
> > > When eagerly switching PKRU in switch_fpu_finish() it checks that
> > > current is not a kernel thread as kernel threads will never use PKRU.
> > > It's possible that this_cpu_read_stable() on current_task
> > > (ie. get_current()) is returning an old cached value. To resolve this
> > > reference next_p directly rather than relying on current.
> > >
> > > As written it's possible when switching from a kernel thread to a
> > > userspace thread to observe a cached PF_KTHREAD flag and never restore
> > > the PKRU. And as a result this issue only occurs when switching
> > > from a kernel thread to a userspace thread, switching from a non kernel
> > > thread works perfectly fine because all that is considered in that
> > > situation are the flags from some other non kernel task and the next fpu
> > > is passed in to switch_fpu_finish().
> > >
> > > This behavior only exists between 5.2 and 5.13 when it was fixed by a
> > > rewrite decoupling PKRU from xstate, in:
> > > commit 954436989cc5 ("x86/fpu: Remove PKRU handling from switch_fpu_finish()")
> > >
> > > Unfortunately backporting the fix from 5.13 is probably not realistic as
> > > it's part of a 60+ patch series which rewrites most of the PKRU handling.
> > >
> > > Fixes: 0cecca9d03c9 ("x86/fpu: Eager switch PKRU state")
> > > Signed-off-by: Brian Geffon <bgeffon@...gle.com>
> > > Signed-off-by: Willis Kung <williskung@...gle.com>
> > > Tested-by: Willis Kung <williskung@...gle.com>
> > > Cc: <stable@...r.kernel.org> # v5.4.x
> > > Cc: <stable@...r.kernel.org> # v5.10.x
> > > ---
> > > arch/x86/include/asm/fpu/internal.h | 13 ++++++++-----
> > > arch/x86/kernel/process_32.c | 6 ++----
> > > arch/x86/kernel/process_64.c | 6 ++----
> > > 3 files changed, 12 insertions(+), 13 deletions(-)
> >
> > So this is ONLY for 5.4.y and 5.10.y? I'm really really loath to take
> > non-upstream changes as 95% of the time (really) it goes wrong.
>
> That's correct, this bug was introduced in 5.2 and that code was
> completely refactored in 5.13 indirectly fixing it.
What series of commits contain that work?
And again, why not just take them? What is wrong with that if this is
such a big issue?
> > How was this tested, and what do the maintainers of this subsystem
> > think? And will you be around to fix the bugs in this when they are
> > found?
>
> This has been trivial to reproduce, I've used a small repro which I've
> put here: https://gist.github.com/bgaff/9f8cbfc8dd22e60f9492e4f0aff8f04f
> , I also was able to reproduce this using the protection_keys self
> tests on a 11th Gen Core i5-1135G7. I'm happy to commit to addressing
> any bugs that may appear. I'll see what the maintainers say, but there
> is also a smaller fix that just involves using this_cpu_read() in
> switch_fpu_finish() for this specific issue, although that approach
> isn't as clean.
Can you add the test to the in-kernel tests so that we make sure it is
fixed and never comes back?
thanks,
greg k-h
Powered by blists - more mailing lists