lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADyq12yGvqbb3+hp6f39RqyEM3Mu896yY6ik7Lh39W=o44bYbA@mail.gmail.com>
Date:   Thu, 17 Feb 2022 08:31:12 -0500
From:   Brian Geffon <bgeffon@...gle.com>
To:     Dave Hansen <dave.hansen@...el.com>
Cc:     Greg KH <gregkh@...uxfoundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Willis Kung <williskung@...gle.com>,
        Guenter Roeck <groeck@...gle.com>,
        Borislav Petkov <bp@...e.de>,
        Andy Lutomirski <luto@...nel.org>,
        "# v4 . 10+" <stable@...r.kernel.org>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH stable 5.4,5.10] x86/fpu: Correct pkru/xstate inconsistency

On Wed, Feb 16, 2022 at 10:16 AM Dave Hansen <dave.hansen@...el.com> wrote:
>
> On 2/16/22 02:05, Greg KH wrote:
> >>> How was this tested, and what do the maintainers of this subsystem
> >>> think?  And will you be around to fix the bugs in this when they are
> >>> found?
> >> This has been trivial to reproduce, I've used a small repro which I've
> >> put here: https://gist.github.com/bgaff/9f8cbfc8dd22e60f9492e4f0aff8f04f
> >> , I also was able to reproduce this using the protection_keys self
> >> tests on a 11th Gen Core i5-1135G7. I'm happy to commit to addressing
> >> any bugs that may appear. I'll see what the maintainers say, but there
> >> is also a smaller fix that just involves using this_cpu_read() in
> >> switch_fpu_finish() for this specific issue, although that approach
> >> isn't as clean.
> > Can you add the test to the in-kernel tests so that we make sure it is
> > fixed and never comes back?
>
> It would be great if Brian could confirm this.  But, I'm 99% sure that
> this can be reproduced in the vm/protection_keys.c selftest, if you run
> it for long enough.

Hi Dave,
Yes, this is reproduced by the protection keys selfs tests. If your
kernel was compiled in a way which caches current_task when read via
this_cpu_read_stable(), then when switching from a kernel thread to a
user thread you will observe this behavior, so the only situation
where it's timing related is waiting for that switch from a kernel to
user thread. If your kernel is compiled in a way which does not cache
the value of current_task then you will never observe this behavior.
My understanding is that this is perfectly valid for the compiler to
produce that code.

>
> The symptom here is corruption of the PKRU register.  I created *lots*
> of bugs like this during protection keys development so the selftest
> keeps a shadow copy of the register to specifically watch for corruption.
>
> It's _plausible_ that no one ever ran the pkey selftests with a
> clang-compiled kernel for long enough to hit this issue.

For ChromeOS we use clang and when I tested a vanilla 5.10 kernel
_built with clang_ it also reproduced, so I suspect you're right that
the self tests just haven't been run against clang built kernels all
that often.

How would you and Greg KH like to proceed with this? I'm happy to help
however I can.

Brian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ