lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Tue, 18 Oct 2016 09:58:08 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     riel@...hat.com
Cc:     linux-kernel@...r.kernel.org, bp@...en8.de,
        torvalds@...ux-foundation.org, luto@...nel.org,
        dave.hansen@...el.linux.com, tglx@...utronix.de, hpa@...or.com
Subject: Re: [PATCH RFC 0/3] x86/fpu: defer FPU state loading until return to
 userspace


* riel@...hat.com <riel@...hat.com> wrote:

> These patches defer FPU state loading until return to userspace.
> 
> This has the advantage of not clobbering the FPU state of one task
> with that of another, when that other task only stays in kernel mode.
> 
> It also allows us to skip the FPU restore in kernel_fpu_end(), which
> will help tasks that do multiple invokations of kernel_fpu_begin/end
> without returning to userspace, for example KVM VCPU tasks.
> 
> We could also skip the restore of the KVM VCPU guest FPU state at
> guest entry time, if it is still valid, but I have not implemented
> that yet.
> 
> The code that loads FPU context directly into registers from user
> space memory, or saves directly to user space memory, is wrapped
> in a retry loop, that ensures the FPU state is correctly set up
> at the start, and verifies that it is still valid at the end.
> 
> I have stress tested these patches with various FPU test programs,
> and things seem to survive.
> 
> However, I have not found any good test suites that mix FPU
> use and signal handlers. Close scrutiny of these patches would
> be appreciated.

BTW., for the next version it would be nice to also have a benchmark that shows 
the advantages (and proves that it's not causing measurable overhead elsewhere).

Either an FPU-aware extension to 'perf bench sched' or a separate 'perf bench fpu' 
suite would be nice.

Thanks,

	Ingo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ