Message-ID: <aBAm3a6ovCQzB/1/@yzhao56-desk.sh.intel.com>
Date: Tue, 29 Apr 2025 09:09:49 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
CC: Paolo Bonzini <pbonzini@...hat.com>, <kvm@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, Michael Roth <michael.roth@....com>
Subject: Re: [PATCH] KVM: x86/mmu: Prevent installing hugepages when mem
attributes are changing
On Mon, Apr 28, 2025 at 07:50:21AM -0700, Sean Christopherson wrote:
> On Mon, Apr 28, 2025, Yan Zhao wrote:
> > On Fri, Apr 25, 2025 at 05:10:56PM -0700, Sean Christopherson wrote:
> > > @@ -7686,6 +7707,37 @@ bool kvm_arch_pre_set_memory_attributes(struct kvm *kvm,
> > > if (WARN_ON_ONCE(!kvm_arch_has_private_mem(kvm)))
> > > return false;
> > >
> > > + if (WARN_ON_ONCE(range->end <= range->start))
> > > + return false;
> > > +
> > > + /*
> > > + * If the head and tail pages of the range currently allow a hugepage,
> > > + * i.e. reside fully in the slot and don't have mixed attributes, then
> > > + * add each corresponding hugepage range to the ongoing invalidation,
> > > + * e.g. to prevent KVM from creating a hugepage in response to a fault
> > > + * for a gfn whose attributes aren't changing. Note, only the range
> > > + * of gfns whose attributes are being modified needs to be explicitly
> > > + * unmapped, as that will unmap any existing hugepages.
> > > + */
> > > + for (level = PG_LEVEL_2M; level <= KVM_MAX_HUGEPAGE_LEVEL; level++) {
> > > + gfn_t start = gfn_round_for_level(range->start, level);
> > > + gfn_t end = gfn_round_for_level(range->end - 1, level);
> > > + gfn_t nr_pages = KVM_PAGES_PER_HPAGE(level);
> > > +
> > > + if ((start != range->start || start + nr_pages > range->end) &&
> > > + start >= slot->base_gfn &&
> > > + start + nr_pages <= slot->base_gfn + slot->npages &&
> > > + !hugepage_test_mixed(slot, start, level))
> > Instead of checking mixed flag in disallow_lpage, could we check disallow_lpage
> > directly?
> >
> > So, if mixed flag is not set but disallow_lpage is 1, there's no need to update
> > the invalidate range.
> >
> > > + kvm_mmu_invalidate_range_add(kvm, start, start + nr_pages);
> > > +
> > > + if (end == start)
> > > + continue;
> > > +
> > > + if ((end + nr_pages) <= (slot->base_gfn + slot->npages) &&
> > > + !hugepage_test_mixed(slot, end, level))
> > if ((end + nr_pages > range->end) &&
> > ((end + nr_pages) <= (slot->base_gfn + slot->npages)) &&
> > !lpage_info_slot(gfn, slot, level)->disallow_lpage)
> >
> > ?
>
> No, disallow_lpage is used by write-tracking and shadow paging to prevent creating
> huge pages for a write-protected gfn. mmu_lock is dropped after the pre_set_range
> call to kvm_handle_gfn_range(), and so disallow_lpage could go to zero if the last
> shadow page for the affected range is zapped. In practice, KVM isn't going to be
That's a good point. I missed it.
> doing write-tracking or shadow paging for CoCo VMs, so there's no missed optimization
> on that front.
>
> And if disallow_lpage is non-zero due to a misaligned memslot base/size, then the
> start/end checks will skip this level anyways.
If the gfn and userspace address are not aligned with respect to each other at
a certain level, disallow_lpage for that level is set to 1 for the entire slot.
This is often the case at the 1G level.
But since kvm_vm_set_mem_attributes() holds mmu_lock for write most of the time,
blocking faults over a larger range for another short period looks harmless.
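
To illustrate the gfn vs. userspace address misalignment case, here is a
minimal userspace sketch of the check as I understand it. It is not the
kernel code: pages_per_hpage() and gfn_hva_misaligned() are hypothetical
helper names, and PAGE_SHIFT and the PG_LEVEL_* values are redefined locally
on the assumption of x86 with 4K base pages.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the x86 definitions (assumed values, 4K base pages). */
#define PAGE_SHIFT 12
enum { PG_LEVEL_4K = 1, PG_LEVEL_2M = 2, PG_LEVEL_1G = 3 };

typedef uint64_t gfn_t;

/* Number of 4K pages per hugepage: 1 at 4K, 512 at 2M, 262144 at 1G. */
static inline uint64_t pages_per_hpage(int level)
{
	return 1ULL << ((level - 1) * 9);
}

/*
 * Hypothetical helper: returns true if the slot's base gfn and its userspace
 * address have different offsets within a hugepage of the given level.  In
 * that case no hugepage of that level can back the slot, which is the
 * situation where disallow_lpage is set for the entire slot.
 */
static inline bool gfn_hva_misaligned(gfn_t base_gfn, uint64_t userspace_addr,
				      int level)
{
	gfn_t ugfn = userspace_addr >> PAGE_SHIFT;

	return ((base_gfn ^ ugfn) & (pages_per_hpage(level) - 1)) != 0;
}
```

E.g. a slot at gfn 0x100000 (4G) backed by userspace address 0x7f0000200000
is aligned at the 2M level but not at the 1G level, so only the 1G level
would be disallowed for the whole slot.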