Message-ID: <CALCETrXrWR92L+JP1gaotpjkWd2TowjE9OrJTcj254L=wJfxMw@mail.gmail.com>
Date:   Fri, 15 Jun 2018 11:49:50 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     Dave Hansen <dave.hansen@...ux.intel.com>
Cc:     Andrew Lutomirski <luto@...nel.org>,
        "Jason A. Donenfeld" <Jason@...c4.com>,
        Rik van Riel <riel@...riel.com>,
        LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>
Subject: Re: Lazy FPU restoration / moving kernel_fpu_end() to context switch

On Fri, Jun 15, 2018 at 11:43 AM Dave Hansen
<dave.hansen@...ux.intel.com> wrote:
>
> On 06/15/2018 11:31 AM, Andy Lutomirski wrote:
> > Using WRPKRU is easy, but,
> > unless we do something very clever, actually finding PKRU in the
> > in-memory fpstate image may be slightly nontrivial.
>
> Why?
>
> It's at a constant offset during any one boot, for sure.  XSAVEC/XSAVES
> can move it around, but only if you change the Requested Feature BitMap
> (RFBM) that we pass to XSAVES or change the control register that
> enables XSAVE states (XCR0).
>
> We don't change XCR0 after boot, and RFBM is hard-coded to -1 as far as
> I remember.

I thought that XSAVES didn't allocate space for parts of the state
that are in the init state. I guess I was wrong :)

So never mind: the context switch should just need to WRPKRU the value
stored at the predetermined offset in fpstate for the new task.  And, if
some appropriate debug option is set, it should warn if TIF_FPU_UNLOADED
is clear for the new task.
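
To make the "predetermined offset" point concrete, here is a rough userspace sketch of how the compacted-format offset of PKRU could be computed once at boot and then used to pull the value out of a saved image. Everything named here (compacted_offset(), read_pkru_field(), struct xcomp_info, the example_components table) is made up for illustration and is not kernel code. The component sizes and alignment flags in the table are assumed example values; real code would read them from CPUID leaf 0xD. The WRPKRU itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XSAVE_LEGACY_SIZE 512u  /* x87/SSE legacy region */
#define XSAVE_HDR_SIZE     64u  /* XSAVE header (XSTATE_BV, XCOMP_BV, ...) */
#define XFEATURE_PKRU       9u  /* PKRU is state component 9 */
#define XFEATURE_MAX       10u  /* sketch only covers components 0..9 */

/* Hypothetical per-component description, as CPUID leaf 0xD would report. */
struct xcomp_info {
	uint32_t size;     /* bytes occupied by the component */
	int      align64;  /* nonzero: 64-byte aligned in compacted format */
};

/* ASSUMED example values, not read from any real CPU. */
const struct xcomp_info example_components[XFEATURE_MAX] = {
	[2] = { 256, 0 },   /* AVX */
	[3] = { 64, 0 },    /* MPX BNDREGS */
	[4] = { 64, 0 },    /* MPX BNDCSR */
	[5] = { 64, 0 },    /* AVX-512 opmask */
	[6] = { 512, 0 },   /* AVX-512 ZMM_Hi256 */
	[7] = { 1024, 0 },  /* AVX-512 Hi16_ZMM */
	[8] = { 128, 0 },   /* PT */
	[9] = { 8, 0 },     /* PKRU */
};

/*
 * Offset of component 'target' in a compacted XSAVE image whose layout is
 * described by 'xcomp_bv'. Components are packed in index order after the
 * legacy region and header. Returns 0 if 'target' is not in the mask.
 */
size_t compacted_offset(uint64_t xcomp_bv, unsigned target,
			const struct xcomp_info *info)
{
	size_t off = XSAVE_LEGACY_SIZE + XSAVE_HDR_SIZE;

	assert(target >= 2 && target < XFEATURE_MAX);

	for (unsigned i = 2; i < XFEATURE_MAX; i++) {
		if (!(xcomp_bv & (1ULL << i)))
			continue;
		if (info[i].align64)
			off = (off + 63) & ~(size_t)63;
		if (i == target)
			return off;
		off += info[i].size;
	}
	return 0;
}

/* Read the 32-bit PKRU value from a saved image at a precomputed offset. */
uint32_t read_pkru_field(const uint8_t *xsave_area, size_t pkru_off)
{
	uint32_t pkru;

	memcpy(&pkru, xsave_area + pkru_off, sizeof(pkru));
	return pkru;
}
```

Since neither XCR0 nor the mask changes after boot, compacted_offset() would only need to run once; the context switch path would then do the equivalent of read_pkru_field() on the incoming task's fpstate and feed the result to WRPKRU.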

I suspect that the whole patch should only be a couple hundred lines
of code.  And I think there are VM workloads where it would be a
*huge* win.

--Andy
