[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210519193158.GJ21560@worktop.programming.kicks-ass.net>
Date: Wed, 19 May 2021 21:31:58 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Joerg Roedel <jroedel@...e.de>
Cc: Joerg Roedel <joro@...tes.org>, x86@...nel.org,
Hyunwook Baek <baekhw@...gle.com>, hpa@...or.com,
Andy Lutomirski <luto@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Jiri Slaby <jslaby@...e.cz>,
Dan Williams <dan.j.williams@...el.com>,
Tom Lendacky <thomas.lendacky@....com>,
Juergen Gross <jgross@...e.com>,
Kees Cook <keescook@...omium.org>,
David Rientjes <rientjes@...gle.com>,
Cfir Cohen <cfir@...gle.com>,
Erdem Aktas <erdemaktas@...gle.com>,
Masami Hiramatsu <mhiramat@...nel.org>,
Mike Stunes <mstunes@...are.com>,
Sean Christopherson <seanjc@...gle.com>,
Martin Radev <martin.b.radev@...il.com>,
Arvind Sankar <nivedita@...m.mit.edu>,
linux-coco@...ts.linux.dev, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org
Subject: Re: [PATCH v2 5/8] x86/sev-es: Leave NMI-mode before sending signals
On Wed, May 19, 2021 at 09:13:08PM +0200, Joerg Roedel wrote:
> Hi Peter,
>
> thanks for your review.
>
> On Wed, May 19, 2021 at 07:54:50PM +0200, Peter Zijlstra wrote:
> > On Wed, May 19, 2021 at 03:52:48PM +0200, Joerg Roedel wrote:
> > > --- a/arch/x86/kernel/sev.c
> > > +++ b/arch/x86/kernel/sev.c
> > > @@ -1343,9 +1343,10 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
> > > return;
> > > }
> > >
> > > + instrumentation_begin();
> > > +
> > > irq_state = irqentry_nmi_enter(regs);
> > > lockdep_assert_irqs_disabled();
> > > - instrumentation_begin();
> > >
> > > /*
> > > * This is invoked through an interrupt gate, so IRQs are disabled. The
> >
> > That's just plain wrong. No instrumentation is allowed before you enter
> > the exception context.
>
> Okay.
>
> > > + irqentry_nmi_exit(regs, irq_state);
> > > +
> >
> > And this is wrong too; because at this point the handler doesn't run in
> > _any_ context anymore, certainly not one you can call regular C code
> > from.
>
> The #VC handler is at this point not running on the IST stack anymore,
> but on the stack it came from or on the task stack. So my believe was
> that at this point it inherits the context it came from (just like the
> page-fault handler). But I also don't fully understand the context
> tracking, so is my assumption wrong?
Being on the right stack is only part of the issue; you also need to
make sure your runtime environment is set up.
Regular kernel C expects a whole lot of things to be present; esp. so
with all the debug options on. The irqentry_*_enter() family of
functions very carefully sets up this environment and the
irqentry_*_exit() undoes it again. Before and after you really cannot
run normal code.
Just an example, RCU might not be watching, it might think the CPU is in
userspace and advance the GP while you're relying on it not doing so.
Similarly lockdep is in some undefined state and any lock used can
trigger random 'funny' things.
Just because this is 'C', doesn't immediately mean you can go call any
random function. Up until recently most of this was in ASM. There's a
reason for the noinstr annotations.
Powered by blists - more mailing lists