[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e605082ac8361c1932bfddfe2055660c7cea5f2b.camel@redhat.com>
Date: Sun, 27 Mar 2022 18:06:07 +0300
From: Maxim Levitsky <mlevitsk@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Oliver Upton <oupton@...gle.com>,
Peter Shier <pshier@...gle.com>
Subject: Re: [PATCH 00/21] KVM: x86: Event/exception fixes and cleanups
On Thu, 2022-03-24 at 21:31 +0000, Sean Christopherson wrote:
> On Sun, Mar 13, 2022, Maxim Levitsky wrote:
> > On Fri, 2022-03-11 at 03:27 +0000, Sean Christopherson wrote:
> > > The main goal of this series is to fix KVM's longstanding bug of not
> > > honoring L1's exception intercepts wants when handling an exception that
> > > occurs during delivery of a different exception. E.g. if L0 and L1 are
> > > using shadow paging, and L2 hits a #PF, and then hits another #PF while
> > > vectoring the first #PF due to _L1_ not having a shadow page for the IDT,
> > > KVM needs to check L1's intercepts before morphing the #PF => #PF => #DF
> > > so that the #PF is routed to L1, not injected into L2 as a #DF.
> > >
> > > nVMX has hacked around the bug for years by overriding the #PF injector
> > > for shadow paging to go straight to VM-Exit, and nSVM has started doing
> > > the same. The hacks mostly work, but they're incomplete, confusing, and
> > > lead to other hacky code, e.g. bailing from the emulator because #PF
> > > injection forced a VM-Exit and suddenly KVM is back in L1.
> > >
> > > Everything leading up to that are related fixes and cleanups I encountered
> > > along the way; some through code inspection, some through tests (I truly
> > > thought this series was finished 10 commits and 3 days ago...).
> > >
> > > Nothing in here is all that urgent; all bugs tagged for stable have been
> > > around for multiple releases (years in most cases).
> > >
> > I am just curious. Are you aware that I worked on this few months ago?
>
> Ah, so that's why I had a feeling of deja vu when factoring out kvm_queued_exception.
> I completely forgot about it :-/ In my defense, that was nearly a year ago[1][2], though
> I suppose one could argue 11 == "a few" :-)
>
> [1] https://lore.kernel.org/all/20210225154135.405125-1-mlevitsk@redhat.com
> [2] https://lore.kernel.org/all/20210401143817.1030695-3-mlevitsk@redhat.com
>
> > I am sure that you even reviewed some of my code back then.
>
> Yep, now that I've found the threads I remember discussing the mechanics.
>
> > If so, could you have had at least mentioned this and/or pinged me to continue
> > working on this instead of re-implementing it?
>
> I'm invoking Hanlon's razor[*]; I certainly didn't intended to stomp over your
> work, I simply forgot.
Thank you very much for the explanation, and I am glad that it was a honest mistake.
Other than that I am actually very happy that you posted this patch series,
as this gives more chance that this long standing issue will be fixed,
and if your patches are better/simpler/less invasive to KVM and still address the issue,
I fully support using them instead of mine.
Totally agree with you about your thoughts about splitting pending/injected exception,
I also can't say I liked my approach that much, for the same reasons you mentioned.
It is also the main reason I put the whole thing on the backlog lately,
because I was feeling that I am changing too much of the KVM,
for a relatively theoretical issue.
I will review your patches, compare them to mine, and check if you or I missed something.
PS:
Back then, I also did an extensive review on few cases when qemu injects exceptions itself,
which it does thankfully rarely. There are several (theoretical) issues there.
I don't remember those details, I need to refresh my memory.
AFAIK, qemu injects #MC sometimes when it gets it from the kernel in form of a signal,
if I recall this correctly, and it also reflects back #DB, when guest debug was enabled
(and that is the reason for some work I did in this area, like the KVM_GUESTDBG_BLOCKIRQ thing)
Qemu does this without considering nested and/or pending exception/etc.
It just kind of abuses the KVM_SET_VCPU_EVENTS for that.
Best regards,
Maxim Levitsky
>
> As for the technical aspects, looking back at your series, I strongly considered
> taking the same approach of splitting pending vs. injected (again, without any
> recollection of your work). I ultimately opted to go with the "immediated morph
> to pending VM-Exit" approach as it allows KVM to do the right thing in almost every
> case without requiring new ABI, and even if KVM screws up, e.g. queues multiple
> pending exceptions. It also neatly handles one-off things like async #PF in L2.
>
> However, I hadn't considered your approach, which addresses the ABI conundrum by
> processing pending=>injected immediately after handling the VM-Exit. I can't think
> of any reason that wouldn't work, but I really don't like splitting the event
> priority logic, nor do I like having two event injection sites (getting rid of the
> extra calls to kvm_check_nested_events() is still on my wish list). If we could go
> back in time, I would likely vote for properly tracking injected vs. pending, but
> since we're mostly stuck with KVM's ABI, I prefer the "immediately morph to pending
> VM-Exit" hack over the "immediately morph to 'injected' exception" hack.
>
> [*] https://en.wikipedia.org/wiki/Hanlon%27s_razor
>
Powered by blists - more mailing lists