linux-kernel - Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues


Date:	Mon, 2 May 2016 20:32:16 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Dave Hansen <dave.hansen@...ux.intel.com>
Cc:	Andy Lutomirski <luto@...capital.net>,
	Yu-cheng Yu <yu-cheng.yu@...el.com>, X86 ML <x86@...nel.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Andy Lutomirski <luto@...nel.org>,
	Borislav Petkov <bp@...e.de>,
	Sai Praneeth Prakhya <sai.praneeth.prakhya@...el.com>,
	"Ravi V. Shankar" <ravi.v.shankar@...el.com>,
	Fenghua Yu <fenghua.yu@...el.com>
Subject: Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues


* Dave Hansen <dave.hansen@...ux.intel.com> wrote:

> On 04/30/2016 12:53 AM, Ingo Molnar wrote:
> > We can still use the compacted area handling instructions, because presumably 
> > those are the fastest and are also the most optimized ones? But I wouldn't use 
> > them to do dynamic allocation: just allocate the maximum possible FPU save area at 
> > task creation time and never again worry about that detail.
> > 
> > Ok?
> 
> Sounds sane to me.
> 
> BTW, I hacked up your "fpu performance" to compare XSAVE vs. XSAVES:
> 
> > [    0.048347] x86/fpu: Cost of: XSAVE                       insn          :   127 cycles
> > [    0.049134] x86/fpu: Cost of: XSAVES                      insn          :   113 cycles
> > [    0.048492] x86/fpu: Cost of: XRSTOR                      insn          :   120 cycles
> > [    0.049267] x86/fpu: Cost of: XRSTORS                     insn          :   102 cycles
> 
> So I guess we can add that to the list of things that XSAVES is good for.

Absolutely!

> [...]  Granted, the real-world benefit is probably hard to measure because the 
> cache residency of the XSAVE buffer isn't as good when _actually_ context 
> switching, but this at least shows a small theoretical advantage for XSAVES.

Yeah, and anything that was measured for real is far from being theoretical. It's 
simply a best-case microbenchmark figure, but it's still a nice 10+ cycles 
improvement overall - which might become bigger in future CPU generations.

Thanks,

	Ingo
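
For reference, the savings implied by the figures quoted above can be worked out with a short script. This is only a sketch: the cycle counts are copied from the boot log in the thread, and the `costs` table and `saving` helper are illustrative names, not part of the kernel or of the patch set.

```python
# Cycle counts copied from the boot log quoted in the thread.
costs = {
    "XSAVE": 127,
    "XSAVES": 113,
    "XRSTOR": 120,
    "XRSTORS": 102,
}


def saving(baseline, compacted):
    """Return (cycles saved, percent saved) of `compacted` relative to `baseline`."""
    delta = costs[baseline] - costs[compacted]
    return delta, 100.0 * delta / costs[baseline]


save_delta, save_pct = saving("XSAVE", "XSAVES")
restore_delta, restore_pct = saving("XRSTOR", "XRSTORS")

print(f"save:       {save_delta} cycles ({save_pct:.1f}%)")
print(f"restore:    {restore_delta} cycles ({restore_pct:.1f}%)")
print(f"round trip: {save_delta + restore_delta} cycles")
```

As noted in the thread, these are best-case microbenchmark numbers, so the percentages describe the instructions in isolation rather than a real context switch.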