Message-ID: <CAHmME9pBqGhCjdwx64GxYTKWiMkDNY3v2gnVL_Xm2q=3guOAsQ@mail.gmail.com>
Date:   Fri, 15 Jun 2018 15:11:15 +0200
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        Andy Lutomirski <luto@...capital.net>
Subject: Lazy FPU restoration / moving kernel_fpu_end() to context switch

Hi Andy & folks,

Lots of crypto routines look like this:

kernel_fpu_begin();
encrypt();
kernel_fpu_end();

If you call such a routine twice, you get:

kernel_fpu_begin();
encrypt();
kernel_fpu_end();
kernel_fpu_begin();
encrypt();
kernel_fpu_end();

In a loop this looks like:

for (thing) {
  kernel_fpu_begin();
  encrypt(thing);
  kernel_fpu_end();
}

This is obviously very bad, because begin() and end() are slow, so
WireGuard does the obvious:

kernel_fpu_begin();
for (thing)
  encrypt(thing);
kernel_fpu_end();

This is fine and well, and the crypto API I'm working on will enable
this to be done in a clear way, but I do wonder if maybe this is not
something that should be happening at the level of the caller, but
rather in the fpu functions themselves. Namely, what are your thoughts
on modifying kernel_fpu_end() so that it doesn't actually restore the
state, but just reenables preemption and marks that on the next
context switch, the state should be restored? Then, essentially,
kernel_fpu_begin() and end() become free after the first usage of
kernel_fpu_begin().
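
To make the cost difference concrete, here is a rough userspace model of
the idea, not kernel code. Every name in it (struct fpu_model,
model_fpu_begin(), model_fpu_end(), model_context_switch(), run_loop())
is hypothetical and made up for illustration. It assumes begin() only
saves the user state when it has not already been saved, and it just
counts saves and restores; the real kernel_fpu_begin()/kernel_fpu_end()
do considerably more than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct fpu_model {
	bool lazy;             /* true: defer the restore to the context switch */
	bool user_state_saved; /* user FPU registers are parked in memory */
	bool in_kernel_fpu;    /* currently inside a begin()/end() section */
	unsigned int saves;    /* number of (expensive) state saves */
	unsigned int restores; /* number of (expensive) state restores */
};

static void model_fpu_begin(struct fpu_model *m)
{
	assert(!m->in_kernel_fpu);
	m->in_kernel_fpu = true;   /* stands in for disabling preemption */
	if (!m->user_state_saved) {
		m->saves++;        /* only save if nothing is saved yet */
		m->user_state_saved = true;
	}
}

static void model_fpu_end(struct fpu_model *m)
{
	assert(m->in_kernel_fpu);
	if (!m->lazy) {
		m->restores++;     /* eager: restore right away */
		m->user_state_saved = false;
	}
	/* lazy: leave the state saved, i.e. "restore pending" */
	m->in_kernel_fpu = false;  /* stands in for reenabling preemption */
}

/* Stands in for the next context switch, where a pending restore is paid. */
static void model_context_switch(struct fpu_model *m)
{
	assert(!m->in_kernel_fpu);
	if (m->user_state_saved) {
		m->restores++;
		m->user_state_saved = false;
	}
}

/* The un-hoisted loop from above: begin()/end() around every item. */
static struct fpu_model run_loop(bool lazy, unsigned int n)
{
	struct fpu_model m = { .lazy = lazy };

	for (unsigned int i = 0; i < n; i++) {
		model_fpu_begin(&m);
		/* encrypt(thing) would run here */
		model_fpu_end(&m);
	}
	model_context_switch(&m);
	return m;
}
```

In this model, run_loop(false, 8) pays for 8 saves and 8 restores, while
run_loop(true, 8) pays for one of each, which is the same cost as
hoisting begin() and end() out of the loop by hand.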

Is this something feasible? I know that performance-wise, I'm really
gaining a lot from hoisting those functions out of the loops, and
API-wise, it'd be slightly simpler to implement if I didn't have to
allow for that hoisting.

Regards,
Jason
