[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4836F77C.309@zytor.com>
Date: Fri, 23 May 2008 09:57:32 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: Roland McGrath <roland@...hat.com>
CC: Suresh Siddha <suresh.b.siddha@...el.com>,
Mikael Pettersson <mikpe@...uu.se>,
Andi Kleen <andi@...stfloor.org>, mingo@...e.hu,
tglx@...utronix.de, torvalds@...ux-foundation.org,
akpm@...ux-foundation.org, drepper@...hat.com,
Hongjiu.lu@...el.com, linux-kernel@...r.kernel.org,
arjan@...ux.intel.com, rmk+lkml@....linux.org.uk, dan@...ian.org,
asit.k.mallick@...el.com
Subject: Re: [RFC] x86: xsave/xrstor support, ucontext_t extensions
Roland McGrath wrote:
>> What I was doing in the RFC is: restore the state what ever that was
>> present and init the state that was not present in the stack frame.
>
> That is consistent in spirit with the existing treatment of FPU data.
> That is, if the sigcontext.fpstate pointer is NULL, the thread's FPU
> state is reset to default. (And despite what hpa said about being
> "supported", the facts in the code are that sigreturn just follows the
> sigcontext.fpstate pointer, whatever it is. On 32-bit, the pointer is
> NULL in the context saved when the thread had not used the FPU, so
> modifying the sigcontext to include FP state when it didn't before
> requires putting in some user-chosen pointer. There in fact may well be
> existing code that does user-level coroutine switching using sigreturn
> and relies on this, for all we know.)
>
Okay. Pretty much what it comes down to is that there is no ideal
solution. Thus, we're trying to explore the potential tradeoffs. The
scenario you describe above will crash horribly for a non-FXSAVE aware
application running on an FXSAVE kernel.
Either way, there has been a long time since, and new bad applications
have obviously emerged, partially "assisted" by our propensity to not
document, and the deep gulf between our kernel and userspace developers.
Let's try another strawman on for size:
- It is clear it is desirable not just for the frame itself but for the
fpstate to be self-describing.
- Thus, let's put a magic cookie in one of the reserved fields at the
end of the FXSAVE region, and make sure it is long enough to be unlikely
to pop up randomly; as well as another magic cookie outside the FXSAVE
region.
- The signal delivery code will write the cookie (or zero, for !XSAVE)
regardless of any crap ptrace might have written into it.
- We will ALSO set bit 0 in uc_flags for RT sigframes as an additional
assurance.
- We will introduce at least a 32-bit field for future use, to be
written unconditionally zero for now. We don't want to have to go
through this particular torture yet again.
- The XSAVE state beyond the FXSAVE region needs to be self-describing.
This may mean adding information not provided by the hardware.
Furthermore, it must be possible for userspace to know the length of the
frame, even if it doesn't understand its detailed contents.
None of this is foolproof on older kernels -- there simply *IS* no
option for older kernels that is 100% guaranteed, thanks to various
assumptions made and design decisions taken over the years. There are a
couple of failure scenarious here:
- XSAVE-aware application running on pre-XSAVE kernel:
Such an application will be aware that the XSAVE information may not
exist, but needs to know (with high probability) that it isn't
present. We have CPUID.OSXSAVE, the uc_flags bit, and the magic
cookie to help here. ptrace can introduce the magic cookie falsely
into the state, but ptrace can introduce all kinds of failures;
either way they would (probablistically) not see the *second*
cookie.
The fact that 64-bit kernels don't clear the unused fields is of
less concern, since 64-bit kernels get the uc_flags field.
- Pre-XSAVE application running on XSAVE kernel with XSAVE enabled:
Here we have the potential for all kinds of corrupt state, including
userspace trying to save away the state and load it later, not
knowing the proper size of it. Worse, some sick person might try to
save and restore state from different hosts, with potential for
all kinds of mayhem.
The saved state, if copied from the original, would contain the first
cookie but not the second cookie.
Again, the use of two cookies here adds some amount of assurance;
but that again amounts to probabilistic failure detection. However,
I personally don't see any way to avoid that scenario at all.
-hpa
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists