Message-ID: <b44f75f9d6c66f33cab85cbe463cc388d48ac7eb.camel@infradead.org>
Date: Mon, 14 Nov 2022 09:36:14 -0800
From: David Woodhouse <dwmw2@...radead.org>
To: Sean Christopherson <seanjc@...gle.com>
Cc: "pbonzini@...hat.com" <pbonzini@...hat.com>,
"mhal@...x.co" <mhal@...x.co>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"Durrant, Paul" <pdurrant@...zon.co.uk>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Kaya, Metin" <metikaya@...zon.co.uk>
Subject: Re: [EXTERNAL][PATCH 03/16] KVM: x86: set gfn-to-pfn cache length
consistently with VM word size
On Mon, 2022-11-14 at 16:33 +0000, Sean Christopherson wrote:
> On Mon, Nov 14, 2022, Woodhouse, David wrote:
> > Most other data structures, including the pvclock info (both Xen and
> > native KVM), could potentially cross page boundaries. And isn't that
> > also true for things that we'd want to use the GPC for in nesting?
>
> Off the top of my head, no. Except for MSR and I/O permission bitmaps, which
> are >4KiB, things that are referenced by physical address are <=4KiB and must be
> naturally aligned. nVMX does temporarily map L1's MSR bitmap, but that could be
> split into two separate mappings if necessary.
>
> > For the runstate info I suggested reverting commit a795cd43c5b5 but
> > that doesn't actually work because it still has the same problem. Even
> > the gfn-to-hva cache still only really works for a single page, and
> > things like kvm_write_guest_offset_cached() will fall back to using
> > kvm_write_guest() in the case where it crosses a page boundary.
> >
> > I'm wondering if the better fix is to allow the GPC to map more than
> > one page.
>
> I agree that KVM should drop the "no page splits" restriction, but I don't think
> that would necessarily solve all KVM Xen issues. KVM still needs to precisely
> handle the "correct" struct size, e.g. if one of the structs is placed at the very
> end of the page such that the smaller compat version doesn't split a page but the
> 64-bit version does.
I think we can be more explicit that the guest 'long' mode shall never
change while anything is mapped. Xen automatically detects that a guest
is in 64-bit mode very early on, either in the first 'fill the
hypercall page' MSR write, or when setting HVM_PARAM_CALLBACK_IRQ to
configure interrupt routing.
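
One nice consequence of that rule is that the guest-visible size of
each shared structure can be computed once, when the cache is
activated, rather than rechecked on every access. Purely as an
illustration (the helper name is invented; the 64-bit layout follows
the Xen public headers, and the packed variant mirrors the 32-bit ABI):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit guest layout: 'state' pads out to 8 bytes, 48 bytes total. */
struct vcpu_runstate_info {
	int      state;
	uint64_t state_entry_time;
	uint64_t time[4];
};

/* 32-bit guest layout: no padding after 'state', 44 bytes total. */
struct compat_vcpu_runstate_info {
	uint32_t state;
	uint64_t state_entry_time;
	uint64_t time[4];
} __attribute__((packed));

/* Pick the guest-visible size from the VM's long-mode setting. */
static size_t xen_runstate_size(bool long_mode)
{
	return long_mode ? sizeof(struct vcpu_runstate_info)
			 : sizeof(struct compat_vcpu_runstate_info);
}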
Strictly speaking, a guest could put itself into 32-bit mode and set
HVM_PARAM_CALLBACK_IRQ *again*. Xen would only update the wallclock
time in that case and would make no attempt to convert anything else;
I don't think we need to replicate that.
On kexec/soft reset it could go back to 32-bit mode, but the soft reset
unmaps everything so that's OK.
I looked at making the GPC handle multiple pages but couldn't see how
to do it sanely for the IOMEM case: vmap() takes a list of *pages*,
not PFNs, and memremap_pages() is... overly complex.
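
FWIW the non-IOMEM side would be simple enough; it's only the IOMEM
case that has no sane equivalent. Roughly this (just a sketch with an
invented helper, not actual GPC code):

#include <linux/kvm_host.h>
#include <linux/vmalloc.h>

/*
 * Map two guest pages contiguously in the non-IOMEM case.  vmap()
 * needs struct page pointers, so this only works when both PFNs are
 * backed by struct pages; an IOMEM PFN has no struct page, which is
 * exactly where it falls apart.
 */
static void *gpc_map_two_pages(kvm_pfn_t pfn0, kvm_pfn_t pfn1)
{
	struct page *pages[2];

	if (!pfn_valid(pfn0) || !pfn_valid(pfn1))
		return NULL;	/* IOMEM: no struct page, no vmap() */

	pages[0] = pfn_to_page(pfn0);
	pages[1] = pfn_to_page(pfn1);

	/* One virtually contiguous kernel mapping over both pages. */
	return vmap(pages, 2, VM_MAP, PAGE_KERNEL);
}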
But if we can reduce it to *just* the runstate info that potentially
needs more than one page, then we can probably handle that by using
two GPC (or maybe GHC) caches for it.
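
Something along these lines for the actual update, with the usual
lock/check/refresh dance around ->khva elided and the helper name
invented for illustration:

#include <linux/kvm_host.h>

/*
 * Write a runstate area that may straddle a page boundary using two
 * caches: gpc1 maps the page containing the start of the structure,
 * gpc2 the page after it.
 */
static void write_split_runstate(struct gfn_to_pfn_cache *gpc1,
				 struct gfn_to_pfn_cache *gpc2,
				 const void *data, size_t len)
{
	/* Bytes of the structure that fit in the first page. */
	size_t part1 = PAGE_SIZE - offset_in_page(gpc1->gpa);
	const char *src = data;

	if (part1 >= len) {
		/* No boundary crossing; the second cache isn't needed. */
		memcpy(gpc1->khva, src, len);
		return;
	}

	memcpy(gpc1->khva, src, part1);
	memcpy(gpc2->khva, src + part1, len - part1);
}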