Message-ID: <ab5bfeb4-af66-35b4-40da-829c7f98dcc2@intel.com>
Date: Fri, 27 Aug 2021 13:25:29 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Borislav Petkov <bp@...en8.de>,
"Yu, Yu-cheng" <yu-cheng.yu@...el.com>
Cc: x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, linux-mm@...ck.org,
linux-arch@...r.kernel.org, linux-api@...r.kernel.org,
Arnd Bergmann <arnd@...db.de>,
Andy Lutomirski <luto@...nel.org>,
Balbir Singh <bsingharora@...il.com>,
Cyrill Gorcunov <gorcunov@...il.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Eugene Syromiatnikov <esyr@...hat.com>,
Florian Weimer <fweimer@...hat.com>,
"H.J. Lu" <hjl.tools@...il.com>, Jann Horn <jannh@...gle.com>,
Jonathan Corbet <corbet@....net>,
Kees Cook <keescook@...omium.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
Nadav Amit <nadav.amit@...il.com>,
Oleg Nesterov <oleg@...hat.com>, Pavel Machek <pavel@....cz>,
Peter Zijlstra <peterz@...radead.org>,
Randy Dunlap <rdunlap@...radead.org>,
"Ravi V. Shankar" <ravi.v.shankar@...el.com>,
Dave Martin <Dave.Martin@....com>,
Weijiang Yang <weijiang.yang@...el.com>,
Pengfei Xu <pengfei.xu@...el.com>,
Haitao Huang <haitao.huang@...el.com>,
Rick P Edgecombe <rick.p.edgecombe@...el.com>
Subject: Re: [PATCH v29 23/32] x86/cet/shstk: Add user-mode shadow stack support

On 8/27/21 11:21 AM, Borislav Petkov wrote:
> On Fri, Aug 27, 2021 at 11:10:31AM -0700, Yu, Yu-cheng wrote:
>> Because on context switches the whole xstates are switched together,
>> we need to make sure all are in registers.
>
> There's context switch code which does that already.
>
> Why would shstk_setup() be responsible for switching the whole extended
> states buffer instead of only the shadow stack stuff?

I don't think this has anything to do with context-switching, really.
The code lands in shstk_setup() which wants to make sure that the new
MSR values are set before the task goes out to userspace. If
TIF_NEED_FPU_LOAD was set, it could do that by going out to the XSAVE
buffer and setting the MSR state in the buffer. Before returning to
userspace, it would be XRSTOR'd. A WRMSR by itself would not be
persistent because that XRSTOR would overwrite it.

But, if TIF_NEED_FPU_LOAD is *clear* it means the XSAVE buffer is
out-of-date and the registers are live. WRMSR can be used and there
will be an XSAVE* to the task buffer during a context switch.
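
To make the two cases concrete, here is a toy user-space C model. It is
only a sketch, not kernel code: struct toy_task and every toy_*() function
are made up for illustration, a single integer stands in for the CET MSR
state, and the bool plays the role of TIF_NEED_FPU_LOAD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical names): one MSR, one saved copy, one flag. */
struct toy_task {
	uint64_t live_msr;      /* stands in for the MSR in the CPU */
	uint64_t xsave_buf_msr; /* stands in for the copy in the XSAVE buffer */
	bool need_fpu_load;     /* stands in for TIF_NEED_FPU_LOAD */
};

/* A bare "WRMSR": touches only the live register. */
static void toy_wrmsr(struct toy_task *t, uint64_t val)
{
	t->live_msr = val;
}

/* Return to userspace: "XRSTOR" from the buffer if the flag is set. */
static void toy_exit_to_user(struct toy_task *t)
{
	if (t->need_fpu_load) {
		t->live_msr = t->xsave_buf_msr;
		t->need_fpu_load = false;
	}
}

/* Context switch away: "XSAVE*" the registers if they are live. */
static void toy_switch_out(struct toy_task *t)
{
	if (!t->need_fpu_load) {
		t->xsave_buf_msr = t->live_msr;
		t->need_fpu_load = true;
	}
}

/* Flag-aware update: write whichever copy is currently authoritative. */
static void toy_set_msr(struct toy_task *t, uint64_t val)
{
	if (t->need_fpu_load)
		t->xsave_buf_msr = val;
	else
		toy_wrmsr(t, val);
}

/* Walks through the lost write, then the flag-aware write. */
static void toy_demo(void)
{
	struct toy_task t = {
		.live_msr = 0, .xsave_buf_msr = 1, .need_fpu_load = true,
	};

	toy_wrmsr(&t, 42);        /* flag is set: register is not authoritative */
	toy_exit_to_user(&t);
	assert(t.live_msr == 1);  /* the 42 was overwritten by the restore */

	toy_switch_out(&t);       /* buffer holds 1 again, flag set */
	toy_set_msr(&t, 42);      /* this time the write goes to the buffer */
	toy_exit_to_user(&t);
	assert(t.live_msr == 42); /* the value survives the restore */
}
```

Calling toy_demo() exercises both paths: the bare register write is lost
while the flag is set, and the flag-aware write is not.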

So, this code takes the coward's way out: it *forces* TIF_NEED_FPU_LOAD
to be clear by making the registers live with fpregs_restore_userregs().
That lets it just use WRMSR instead of dealing with the XSAVE buffer
directly. If it didn't do this with the *WHOLE* set of user FPU state,
we'd need more fine-grained "NEED_*_LOAD" tracking than our one FPU bit.
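
A toy user-space C model may help show why restoring everything makes the
plain register write safe. This is an illustrative sketch with hypothetical
names, not the patch's code: toy_restore_userregs() only stands in for
fpregs_restore_userregs(), "other" stands in for the rest of the user FPU
state, and locking and preemption are ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical names): two state components share one flag. */
struct toy_fpu {
	uint64_t live_cet, live_other; /* registers */
	uint64_t buf_cet, buf_other;   /* copies in the "XSAVE buffer" */
	bool need_fpu_load;            /* the single bit covering all of it */
};

/* Load *all* components from the buffer, then clear the flag. */
static void toy_restore_userregs(struct toy_fpu *f)
{
	if (f->need_fpu_load) {
		f->live_cet = f->buf_cet;
		f->live_other = f->buf_other;
		f->need_fpu_load = false;
	}
}

/* Force the registers live, then do a plain register write ("WRMSR"). */
static void toy_shstk_set_cet(struct toy_fpu *f, uint64_t val)
{
	toy_restore_userregs(f);
	f->live_cet = val;
}

/* Later context switch: save every live register and set the flag. */
static void toy_switch_out(struct toy_fpu *f)
{
	if (!f->need_fpu_load) {
		f->buf_cet = f->live_cet;
		f->buf_other = f->live_other;
		f->need_fpu_load = true;
	}
}

static void toy_demo(void)
{
	struct toy_fpu f = {
		.buf_cet = 0, .buf_other = 7, .need_fpu_load = true,
	};

	toy_shstk_set_cet(&f, 3);
	/* Flag is clear and *both* components are valid in registers. */
	assert(!f.need_fpu_load && f.live_cet == 3 && f.live_other == 7);

	toy_switch_out(&f);
	/* The save picks up the new value and keeps the other state intact. */
	assert(f.buf_cet == 3 && f.buf_other == 7);
}
```

If toy_restore_userregs() loaded only the CET component before clearing
the flag, the later save would write a stale live_other into the buffer.
That is the situation a single flag cannot describe.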

This is also *only* safe because the task is newly-exec()'d and the FPU
state was just reset. Otherwise, we might have had to worry that the
non-PL3 SSPs have garbage or that non-SHSTK bits are set in MSR_IA32_U_CET.

That said, after staring at it, I *think* this code is functionally
correct and OK performance-wise. I suspect that the (very blunt) XRSTOR
inside of start_update_msrs()->fpregs_restore_userregs() is quite rare
because TIF_NEED_FPU_LOAD will usually be clear due to the proximity to
execve(). So, adding direct XSAVE buffer manipulation would probably
only make it more error prone.