Message-ID: <3babf003-6854-e50a-34ca-c87ce4169c77@citrix.com>
Date: Mon, 24 Aug 2020 14:52:01 +0100
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>
CC: <x86@...nel.org>, Linus Torvalds <torvalds@...ux-foundation.org>,
"Tom Lendacky" <thomas.lendacky@....com>, Pu Wen <puwen@...on.cn>,
"Stephen Hemminger" <sthemmin@...rosoft.com>,
Sasha Levin <alexander.levin@...rosoft.com>,
Dirk Hohndel <dirkhh@...are.com>,
Jan Kiszka <jan.kiszka@...mens.com>,
Tony W Wang-oc <TonyWWang-oc@...oxin.com>,
"H. Peter Anvin" <hpa@...ux.intel.com>,
Asit Mallick <asit.k.mallick@...el.com>,
Gordon Tetlow <gordon@...lows.org>,
David Kaplan <David.Kaplan@....com>,
"Tony Luck" <tony.luck@...el.com>,
Andrew Cooper <andrew.cooper3@...rix.com>
Subject: Re: [RFD] x86: Curing the exception and syscall trainwreck in hardware
On 24/08/2020 13:24, Thomas Gleixner wrote:
> It's a sad state of affairs that I have to write this mail at all and it's
> nothing else than an act of desperation.
>
> The x86 exception handling including the various ways of syscall entry/exit
> are a constant source of trouble. Aside from being a functional disaster,
> quite a few of these issues have severe security implications.
>
> There are similar issues on the virtualization side including the handling
> of essential MSRs which are required to run a guest OS and even more so
> with the upcoming virt specific exceptions of various vendors.
>
> We have been asking the vendors for more than a decade to fix this situation, but
> even the most trivial requests like an IRET variant which does not reenable
> NMIs unconditionally and other small things which would make our life less
> miserable aren't happening.
>
> Instead of fixing the underlying design fails first and creating a solid
> base the vendors add even more ill defined exception variants on top of
> the existing pile. Unsurprisingly these add-ons are creating more
> problems than they solve, but being based on the existing house of cards
> that's obviously expected.
>
> This really has to stop and the underlying issues have to be resolved
> before more problems are inflicted upon operating systems and hypervisors.
> The amount of code to work around these issues is already by far larger than
> the actual functional code. Some of these workarounds are just bandaids
> which try to prevent the most obvious damage, but they are mostly based on
> the hope that the unfixable corner cases never happen.
>
> There has been talk about solutions for years, but it's just talk and we have not
> yet seen a coordinated effort across the x86 vendors to come up with a
> sane replacement for the x86 exception and syscall trainwreck.
>
> The important word here is 'coordinated'. We are not at all interested
> in different solutions from different vendors. It's going to be
> challenging enough to maintain ONE parallel exception/syscall handling
> implementation. In other words, the kernel is going to support exactly
> ONE new exception/syscall handling mechanism and not going to accommodate
> every vendor.
>
> So I call on the x86 vendors to sit together and come up with a unified
> and consolidated base on which each of the vendors can build their
> differentiating features.
>
> Aside from coordination between the x86 vendors this also requires
> coordination with the people who finally have to deal with that on the
> software side. The prevailing hardware engineering principle "That can
> be fixed in software" does not work; it never worked - especially not in
> the area of x86 exception and syscall handling.
>
> This coordination must include all major operating systems and hypervisors
> whether open source or proprietary to ensure that the different
> requirements are met. This kind of coordination has happened in the context
> of the hardware vulnerability mitigations already in a fruitful way so
> this request is not asking for something impossible.
>
> If the x86 vendors are unable to talk to each other and coordinate on a
> solution, then the ultimate backstop might be to take the first reasonable
> design specification and the first reasonable silicon implementation of it
> as the ONE alternative solution to the existing trainwreck. How the other
> vendors are going to deal with that is none of our business. That's the
> least useful and least desired outcome and will only happen when the x86
> vendors are not able to get their act together and sort that out upfront.
And to help with coordination, here is something prepared (slightly)
earlier.

https://docs.google.com/document/d/1hWejnyDkjRRAW-JEsRjA5c9CKLOPc6VKJQsuvODlQEI/edit?usp=sharing

This identifies the problems from software's perspective, along with
proposing behaviour which ought to resolve the issues.

It is still a work-in-progress. The #VE section still needs updating in
light of the publication of the recent TDX spec.

Review and feedback welcome.

Thanks,

~Andrew
Download attachment "x86 Stack Switching - draft 2.1.pdf" of type "application/pdf" (108930 bytes)