Message-ID: <YQ2vv7EXGN2jgQBb@google.com>
Date: Fri, 6 Aug 2021 21:55:11 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Maxim Levitsky <mlevitsk@...hat.com>
Cc: kvm@...r.kernel.org,
"open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
<linux-kernel@...r.kernel.org>, Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>, Borislav Petkov <bp@...en8.de>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>
Subject: Re: KVM's support for non default APIC base
On Thu, Jul 22, 2021, Maxim Levitsky wrote:
> On Mon, 2021-07-19 at 18:49 +0000, Sean Christopherson wrote:
> > On Sun, Jul 18, 2021, Maxim Levitsky wrote:
> -> APIC MMIO area has to be MMIO for 'apic_mmio_write' to be called,
> thus must contain no guest memslots.
> If the guest relocates the APIC base somewhere where we have a memslot,
> memslot will take priority, while on real hardware, LAPIC is likely to
> take priority.
Yep. The thing that really bites us is that other vCPUs should still be able to
access the memory defined by the memslot, e.g. to make it work we'd have to run
the vCPU with a completely different MMU root.
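To illustrate the priority problem described above, here is a minimal userspace model, not KVM code. All names (`struct memslot`, `resolve_gpa`, the `TARGET_*` values) are hypothetical, and the lookup order is simply an assumption that mirrors the "memslot takes priority" behavior discussed in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_4K 0x1000ULL

/* Hypothetical, simplified stand-in for a guest memory slot. */
struct memslot {
    uint64_t base_gpa;
    uint64_t size;
};

enum access_target { TARGET_MEMSLOT, TARGET_APIC_MMIO, TARGET_OTHER_MMIO };

/*
 * Model of the lookup order discussed above: a memslot that covers the
 * address always wins, and only addresses with no memslot can reach the
 * emulated APIC page.
 */
static enum access_target resolve_gpa(const struct memslot *slots, size_t n,
                                      uint64_t apic_base, uint64_t gpa)
{
    for (size_t i = 0; i < n; i++) {
        if (gpa >= slots[i].base_gpa &&
            gpa - slots[i].base_gpa < slots[i].size)
            return TARGET_MEMSLOT;
    }

    if (gpa >= apic_base && gpa - apic_base < PAGE_SIZE_4K)
        return TARGET_APIC_MMIO;

    return TARGET_OTHER_MMIO;
}
```

With a single memslot covering the first 2 GiB, an access to the default base 0xfee00000 reaches the APIC, while an APIC base relocated into that 2 GiB range is shadowed by the memslot and never reaches APIC emulation.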
> As far as I know the only good reason to relocate APIC base is to access it
> from the real mode which is not something that is done these days by modern
> BIOSes.
>
> I vote to make it read only (#GP on MSR_IA32_APICBASE write when non default
> base is set and apic enabled) and remove all remains of the support for
> variable APIC base.
Making up our own behavior is almost never the right approach. E.g. _best_ case
scenario for an unexpected #GP is the guest immediately terminates. Worst case
scenario is the guest eats the #GP and continues on, which is basically the status
quo, except it's guaranteed to not work, whereas today's behavior can at least let
the guest function, for some definitions of "function".
I think the only viable "solution" is to exit to userspace on the guilty WRMSR.
Whether or not we can do that without breaking userspace is probably the big
question. Fully emulating APIC base relocation would be a tremendous amount of
effort and complexity for practically zero benefit.
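A rough sketch of what "exit to userspace on the guilty WRMSR" could look like as a decision function. This is a standalone model, not a patch: `classify_apic_base_write` and the enum are hypothetical names, the address mask assumes a 52-bit maximum physical address, and treating "relocated while disabled" as harmless is an assumption borrowed from the "non default base and apic enabled" condition quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bits of IA32_APIC_BASE (MSR 0x1B). */
#define APIC_BASE_ENABLE    (1ULL << 11)
/* Assumption: base field spans bits 12..51 (52-bit MAXPHYADDR). */
#define APIC_BASE_ADDR_MASK 0x000ffffffffff000ULL
#define APIC_DEFAULT_BASE   0xfee00000ULL

/* Hypothetical outcome of intercepting the WRMSR. */
enum wrmsr_action { HANDLE_IN_KERNEL, EXIT_TO_USERSPACE };

/*
 * Punt to userspace only for the "guilty" write: APIC enabled and the
 * base moved away from the default. Everything else stays on the
 * normal in-kernel path.
 */
static enum wrmsr_action classify_apic_base_write(uint64_t data)
{
    bool enabled = (data & APIC_BASE_ENABLE) != 0;
    bool relocated = (data & APIC_BASE_ADDR_MASK) != APIC_DEFAULT_BASE;

    return (enabled && relocated) ? EXIT_TO_USERSPACE : HANDLE_IN_KERNEL;
}
```

The point of such a split is that userspace, not KVM, gets to decide what to do with a relocating guest (kill it, log it, ignore it). Whether existing userspace copes with a new exit here is exactly the open question.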
> (we already have a warning when APIC base is set to non default value)
FWIW, that warning is worthless because it's _once(), i.e. won't help detect a
misbehaving guest unless it's the first guest to misbehave on a particular
instantiation of KVM. _ratelimited() would improve the situation, but not
completely eliminate the possibility of a misbehaving guest going unnoticed.
Anything else isn't an option because it's obviously guest triggerable.
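For anyone unfamiliar with the difference, a toy userspace model of the two behaviors follows. It is not the kernel's printk implementation: `warn_once`, `warn_ratelimited`, and both structs are hypothetical, and the fixed-window interval/burst scheme is a simplifying assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* "_once" model: fires for the first caller only, then never again. */
struct once_state {
    bool fired;
};

static bool warn_once(struct once_state *s)
{
    if (s->fired)
        return false;
    s->fired = true;
    return true;
}

/* "_ratelimited" model: at most 'burst' messages per 'interval' ticks. */
struct ratelimit {
    uint64_t interval;
    unsigned int burst;
    uint64_t window_start;
    unsigned int printed;
    bool started;
};

static bool warn_ratelimited(struct ratelimit *rl, uint64_t now)
{
    /* Open a new window on first use or once the old one has expired. */
    if (!rl->started || now - rl->window_start >= rl->interval) {
        rl->started = true;
        rl->window_start = now;
        rl->printed = 0;
    }

    if (rl->printed >= rl->burst)
        return false;   /* suppressed: this event goes unreported */

    rl->printed++;
    return true;
}
```

In this model a second misbehaving guest is never reported by `warn_once`, while `warn_ratelimited` reports it unless the current window's burst is already used up, which is the remaining gap mentioned above.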