Message-ID: <4849C257.40603@zytor.com>
Date:	Fri, 06 Jun 2008 16:03:51 -0700
From:	"H. Peter Anvin" <hpa@...or.com>
To:	Suresh Siddha <suresh.b.siddha@...el.com>
CC:	Roland McGrath <roland@...hat.com>,
	Mikael Pettersson <mikpe@...uu.se>,
	Andi Kleen <andi@...stfloor.org>, x86@...nel.org,
	torvalds@...ux-foundation.org, akpm@...ux-foundation.org,
	drepper@...hat.com, Hongjiu.lu@...el.com,
	linux-kernel@...r.kernel.org, arjan@...ux.intel.com,
	rmk+lkml@....linux.org.uk, dan@...ian.org, asit.k.mallick@...el.com
Subject: Re: x86: xsave/xrstor support; ucontext_t extensions

Suresh Siddha wrote:
> I thought we had closure on the previous thread. But no problem.
> It's better late than never.

I apologize.  The last two months have been exceptionally tough.

> xsave and xrstor are performance-sensitive instructions, as they are used
> in process context switches. The image doesn't have to describe itself;
> at any time, one can get all the xsave-relevant layout information using
> cpuid. And when needed, SW can always pass extra information with the
> xsave image.

Yes, the big problem with it is its monolithic nature (with no realistic 
alternate instruction.)

>>   - Magic number (M2)
> 
> As I mentioned earlier, we can avoid this magic number, by including
> a pointer (which points to start of the fp and xstate on stack) along with M1. 

As I mentioned before, this introduces a very different constraint, 
which is a really bad precedent; data shouldn't be dependent on its own 
location.

> This will catch anyone copying the FP state of the frame who is not
> aware of the xstate.
> 
>>   - Descriptor count (DC)
>>   - DC * <EBX, EAX> from CPUID leaf 0Dh
> 
> As you mentioned, this doesn't change after a kernel boot. So do we really
> need to save this static information on every signal? (also please see below
> about the compaction).

I think given the compaction constraint we're okay with the bitmap plus 
length of the area.

>>   - Possibly a checksum or CRC of this structure
>>
>> Note that this tail structure will always be the same on a given kernel, 
>> so it can be pre-canned at boot time.  This tail structure serves two 
>> purposes:
>>
>>   - It can be used to verify against truncation of the state.
>>     (I.e. if an XSAVE-unaware application tries to copy and save away
>>     a state and later restore it, but only copies the first 512 bytes
>>     and later just puts a pointer to it.)
> 
> As I mentioned above, pointer along with M1 should be enough to catch this?

I think the pointer is a really really bad idea.  Even if we don't need 
the structure I think having a tail magic is the better alternative, and 
it's also really cheap to do since you have to have the length pointer 
anyway.
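
A minimal sketch of the tail-magic check, for illustration only. The
magic values, the struct layout, and the names xstate_desc and
xstate_image_complete are hypothetical; nothing here is a layout that
was agreed on in this thread. It assumes M1, the length, and the
bitmap travel together, and that M2 sits immediately after the image.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic values, chosen only for illustration. */
#define XSTATE_MAGIC1 0x58535431u  /* M1, stored with the length and bitmap */
#define XSTATE_MAGIC2 0x58535432u  /* M2, stored right after the save image */

/* Hypothetical descriptor that accompanies the save image. */
struct xstate_desc {
    uint32_t magic1;       /* M1 */
    uint32_t image_size;   /* length of the save image in bytes */
    uint64_t feature_mask; /* bitmap of the state present in the image */
};

/*
 * Return 1 if buf holds a complete image followed by M2, 0 otherwise.
 * buf_len is the number of bytes the caller actually has.
 */
static int xstate_image_complete(const struct xstate_desc *d,
                                 const unsigned char *buf, size_t buf_len)
{
    uint32_t tail;

    if (d->magic1 != XSTATE_MAGIC1)
        return 0;
    /* The legacy FXSAVE region alone is 512 bytes. */
    if (d->image_size < 512)
        return 0;
    /* The tail magic must fit inside the buffer we were given. */
    if (buf_len < sizeof(tail) || d->image_size > buf_len - sizeof(tail))
        return 0;
    /* memcpy avoids an unaligned read of the tail word. */
    memcpy(&tail, buf + d->image_size, sizeof(tail));
    return tail == XSTATE_MAGIC2;
}
```

An application that copies only the first 512 bytes leaves the tail
word unset, so the check fails without the data having to know its
own address.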

>>   - It can be used to verify against an alien state (saved and restored
>>     from another CPU, or even just another kernel version with different
>>     support.)
> 
> Though the xsave layout is extendable, the save area is not
> compacted if some features are not supported by the processor and/or
> system software. This is documented in Vol 2b under the "xsave"
> instruction.

Ah, you're right, my bad.  That does make the problem substantially 
simpler (I somehow read only the second half of the and/or clause, but 
it's all there.)  So, OK, no need for descriptors (he says, as he waits 
for the architectural shoe to drop, especially in a multivendor 
environment.)

> Given that the descriptor offsets don't change, we can
> achieve the same thing with a bit mask representing the state in
> the xsave layout. xrstor with the appropriate bit masks will automatically
> restore/init the state.

Agreed.
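
A small model of the restore/init behaviour being agreed on here,
written as plain C rather than as the instruction itself. The enum and
function names are hypothetical. It assumes the documented XRSTOR
rule: a component is touched only if it is both requested and enabled,
and it is then loaded from memory if its bit is set in the saved
bitmap, or put in its initial state if not.

```c
#include <stdint.h>

/* What happens to one state component on restore. */
enum xstate_action { XSTATE_UNTOUCHED, XSTATE_INIT, XSTATE_RESTORE };

/*
 * bit:       index of the state component (0 = x87, 1 = SSE, ...)
 * requested: mask passed to the restore
 * enabled:   mask of components the OS has enabled
 * saved:     bitmap stored in the save image
 */
static enum xstate_action xrstor_action(unsigned bit, uint64_t requested,
                                        uint64_t enabled, uint64_t saved)
{
    uint64_t m;

    if (bit >= 64)
        return XSTATE_UNTOUCHED;
    m = (uint64_t)1 << bit;

    /* Components outside the requested-and-enabled set are left alone. */
    if (!(requested & enabled & m))
        return XSTATE_UNTOUCHED;

    /* Present in the image: load it. Absent: reset to initial state. */
    return (saved & m) ? XSTATE_RESTORE : XSTATE_INIT;
}
```

With the offsets fixed, the bitmap alone says which regions of the
image are meaningful, which is why no per-component descriptors are
needed.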

>>   - Mismatch on descriptor sizes:
>>     -> Consider that region corrupt and reinitialize?
>>
>> The region-by-region copy could of course be used even in the same-CPU 
>> case, if there turns out to be a negligible performance difference over 
>> whole-block copy.
> 
> Today in 64-bit, we directly do fxsave/fxrstor in and out of user space
> for signal handlers. I would like to retain this behavior as much as possible
> with xsave/xrstor as well (and at the same time, provide as much information
> as possible for the user to interpret the signal frame). A bit mask representing
> the state saved in the xsave image, M1, the length, and some cookie (a pointer
> along with M1) to detect image truncation can achieve this, can't they?

If the state is complete, which it of course will be something like 
99.9999% of the time, then doing XRSTOR from user space should work just 
fine.  The case of having to stitch up state is clearly an exceptional 
case, which is not at all performance critical in any way.
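
One way to picture the split between the common case and the
exceptional one, as a sketch under stated assumptions. The names are
hypothetical, and the criteria (intact tail, matching length, no
state bits the running kernel doesn't know) are an illustrative
reading of the discussion, not a settled rule.

```c
#include <stdint.h>

enum restore_path {
    RESTORE_DIRECT, /* restore straight from the user-space image */
    RESTORE_STITCH  /* slow path: rebuild the state piece by piece */
};

/*
 * frame_mask/frame_size: bitmap and length found in the signal frame
 * tail_ok:               nonzero if the truncation check passed
 * kernel_mask/kernel_size: values the kernel computed once at boot
 */
static enum restore_path choose_restore_path(uint64_t frame_mask,
                                             uint32_t frame_size,
                                             int tail_ok,
                                             uint64_t kernel_mask,
                                             uint32_t kernel_size)
{
    /* A truncated image cannot be restored in one go. */
    if (!tail_ok)
        return RESTORE_STITCH;
    /* The area is not compacted, so the length is fixed per kernel. */
    if (frame_size != kernel_size)
        return RESTORE_STITCH;
    /* State bits unknown to this kernel suggest an alien image. */
    if (frame_mask & ~kernel_mask)
        return RESTORE_STITCH;
    return RESTORE_DIRECT;
}
```

The checks are a handful of integer comparisons, so they add little
to the common path.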

	-hpa