lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 15 Jun 2018 21:34:38 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     "Jason A. Donenfeld" <Jason@...c4.com>,
        LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        Andy Lutomirski <luto@...capital.net>
Subject: Re: Lazy FPU restoration / moving kernel_fpu_end() to context switch

On Fri, Jun 15, 2018 at 06:25:39PM +0200, Thomas Gleixner wrote:
> On Fri, 15 Jun 2018, Jason A. Donenfeld wrote:
> > In a loop this looks like:
> > 
> > for (thing) {
> >   kernel_fpu_begin();
> >   encrypt(thing);
> >   kernel_fpu_end();
> > }
> > 
> > This is obviously very bad, because begin() and end() are slow, so
> > WireGuard does the obvious:
> > 
> > kernel_fpu_begin();
> > for (thing)
> >   encrypt(thing);
> > kernel_fpu_end();
> > 
> > This is fine and well, and the crypto API I'm working on will enable
> 
> It might be fine crypto performance wise, but it's a total nightmare
> latency wise because kernel_fpu_begin() disables preemption. We've seen
> latencies in the larger millisecond range due to processing large data sets
> with kernel FPU.
> 
> If you want to go there then we really need a better approach which allows
> kernel FPU usage in preemptible context and in case of preemption a way to
> stash the preempted FPU context and restore it when the task gets scheduled
> in again. Just using the existing FPU stuff and moving the loops inside the
> begin/end section and keeping preemption disabled for arbitrary time spans
> is not going to fly.

Didn't we recently do a bunch of crypto patches to help with this?

I think they had the pattern:

	kernel_fpu_begin();
	for (units-of-work) {
		do_unit_of_work();
		if (need_resched()) {
			kernel_fpu_end();
			cond_resched();
			kernel_fpu_begin();
		}
	}
	kernel_fpu_end();

I'd have to go dig out the actual series, but I think they had a bunch
of helpers to deal with that.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ