[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <SA1PR11MB673408E50152DFAB60800DA0A8819@SA1PR11MB6734.namprd11.prod.outlook.com>
Date: Tue, 21 Mar 2023 07:49:37 +0000
From: "Li, Xin3" <xin3.li@...el.com>
To: "andrew.cooper3@...rix.com" <andrew.cooper3@...rix.com>,
Peter Zijlstra <peterz@...radead.org>,
Lai Jiangshan <jiangshanlai@...il.com>
CC: "H. Peter Anvin" <hpa@...or.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>,
"bp@...en8.de" <bp@...en8.de>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"Christopherson,, Sean" <seanjc@...gle.com>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"Shankar, Ravi V" <ravi.v.shankar@...el.com>
Subject: RE: [PATCH v5 22/34] x86/fred: FRED initialization code
> >>> If there is no other concrete reason other than overflowing for
> >>> assigning NMI and #DB with a stack level > 0, #VE should also be
> >>> assigned with a stack level > 0, and #BP too. #VE can happen anytime
> >>> and anywhere, so it is subject to overflowing too.
> >> So #BP needs the stack-gap (redzone) for text_poke_bp().
> >>
> >> #BP can end up in kprobes which can then end up in ftrace/perf, depending
> on
> >> how it's all wired up.
> >>
> >> #VE is currently a trainwreck vs NMI/MCE, but I think FRED solves the worst of
> >> that. I'm not exactly sure how deep the #VE handler goes.
> >>
> > VE under IDT is *not* using an IST, we need some solid rationales here.
>
> #VE, and #VC on AMD, are borderline unusable. Both under IDT and FRED.
Oops!
> The reason #VE is not IST is because there are plenty of real cases
> where a non-malicious outer hypervisor could create reentrant faults
> that lose program state. e.g. hitting an IO instruction, then hitting
> an emulated MSR.
>
> There are fewer cases where a non-IST #VE ends up in a re-entrant fault
> (IIRC, you can still manage it by unmapping the entry stack), but you're
> still trusting the outer hypervisor to not e.g. unmap the SYSCALL entry
> point.
>
> FRED gets rid of the "reentrant fault overwriting it on the stack" case,
> and removes the syscall gap case, replacing them instead with a stack
> overflow in the worst case because there is still no upper bound to how
> many times #VE can actually be delivered in the course of servicing a
> single #VE.
Exactly, FRED stack levels can make use of the whole regular stack space.
I guess you don't seem to support #VE on a higher stack level?
> ~Andrew
>
> P.S. While I hate to cite myself, if you haven't read
> https://docs.google.com/document/d/1hWejnyDkjRRAW-
> JEsRjA5c9CKLOPc6VKJQsuvODlQEI/edit?usp=sharing
> yet, do so. It did feed into some of the FRED design.
Powered by blists - more mailing lists