Message-ID: <YZvloswO5g/o02V6@google.com>
Date: Mon, 22 Nov 2021 18:46:58 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Ben Gardon <bgardon@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org, Peter Xu <peterx@...hat.com>,
Peter Shier <pshier@...gle.com>,
David Matlack <dmatlack@...gle.com>,
Mingwei Zhang <mizhang@...gle.com>,
Yulei Zhang <yulei.kernel@...il.com>,
Wanpeng Li <kernellwp@...il.com>,
Xiao Guangrong <xiaoguangrong.eric@...il.com>,
Kai Huang <kai.huang@...el.com>,
Keqian Zhu <zhukeqian1@...wei.com>,
David Hildenbrand <david@...hat.com>
Subject: Re: [PATCH 11/15] KVM: x86/MMU: Refactor vmx_get_mt_mask

On Mon, Nov 22, 2021, Ben Gardon wrote:
> On Fri, Nov 19, 2021 at 1:03 AM Paolo Bonzini <pbonzini@...hat.com> wrote:
> >
> > On 11/18/21 16:30, Sean Christopherson wrote:
> > > If we really want to make this state per-vCPU, KVM would need to incorporate the
> > > CR0.CD and MTRR settings in kvm_mmu_page_role. For MTRRs in particular, the worst
> > > case scenario is that every vCPU has different MTRR settings, which means that
> > > kvm_mmu_page_role would need to be expanded by 10 bits in order to track every
> > > possible vcpu_idx (currently capped at 1024).
> >
> > Yes, that's insanity. I was also a bit skeptical about Ben's try_get_mt_mask callback,
> > but this would be much much worse.
>
> Yeah, the implementation of that felt a bit kludgy to me too, but
> refactoring the handling of all those CR bits was way more complex
> than I wanted to handle in this patch set.
> I'd love to see some of those CR0 / MTRR settings be set on a VM basis
> and enforced as uniform across vCPUs.

Architecturally, we can't do that. Even a perfectly well-behaved guest will have
(small) periods where the BSP has different settings than APs. And it's technically
legal to have non-uniform MTRR and CR0.CD/NW configurations, even though no modern
BIOS/kernel does that. Except for non-coherent DMA, it's a moot point because KVM
can simply ignore guest cacheability settings.
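
For illustration, a simplified, hypothetical sketch of that memtype decision
(helper name invented here; the real vmx_get_mt_mask() also handles the CD/NW
quirk and the EPT IPAT/shift encoding, all elided):

static u8 ept_memtype_sketch(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
{
	if (is_mmio)
		return MTRR_TYPE_UNCACHABLE;

	/* No non-coherent DMA => guest cacheability is irrelevant, force WB. */
	if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
		return MTRR_TYPE_WRBACK;

	/* CR0.CD set => the vCPU is running with caching disabled. */
	if (kvm_read_cr0(vcpu) & X86_CR0_CD)
		return MTRR_TYPE_UNCACHABLE;

	/* Otherwise honor the guest's MTRRs for this gfn. */
	return kvm_mtrr_get_guest_memory_type(vcpu, gfn);
}
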
> Looking up vCPU 0 and basing things on that feels extra hacky though,
> especially if we're still not asserting uniformity of settings across
> vCPUs.

IMO, it's marginally less hacky than what KVM has today as it allows KVM's behavior
to be clearly and sanely stated, e.g. KVM uses vCPU0's cacheability settings when
mapping non-coherent DMA. Compare that with today's behavior where the cacheability
settings depend on which vCPU first faulted in the address for a given MMU role and
instance of the associated root, and whether other vCPUs share an MMU role/root.
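
Purely as a hypothetical sketch (names and plumbing invented for illustration),
"use vCPU0's settings" for the non-coherent DMA case could be as simple as
keying everything off kvm_get_vcpu(kvm, 0):

static u8 noncoherent_dma_memtype_sketch(struct kvm *kvm, gfn_t gfn)
{
	/*
	 * Always consult vCPU0 so that the resulting memtype doesn't depend
	 * on which vCPU happens to fault in the gfn first.
	 */
	struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);

	if (!vcpu)
		return MTRR_TYPE_WRBACK;

	if (kvm_read_cr0(vcpu) & X86_CR0_CD)
		return MTRR_TYPE_UNCACHABLE;

	return kvm_mtrr_get_guest_memory_type(vcpu, gfn);
}
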
> If we need to track that state to accurately virtualize the hardware
> though, that would be unfortunate.