Message-ID: <acc35587-1fbe-3101-d3cc-86327ebb5837@redhat.com>
Date: Mon, 24 May 2021 14:12:58 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Keqian Zhu <zhukeqian1@...wei.com>, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org, Sean Christopherson <seanjc@...gle.com>,
Ben Gardon <bgardon@...gle.com>
Cc: wanghaibin.wang@...wei.com
Subject: Re: [PATCH v3 2/2] KVM: x86: Not wr-protect huge page with
init_all_set dirty log

On 29/04/21 05:41, Keqian Zhu wrote:
> Currently, when dirty logging starts with init-all-set, we write-protect
> huge pages and leave normal pages untouched, so that dirty logging can
> be enabled for the normal pages lazily.
>
> Enabling dirty logging lazily for huge pages is feasible too. It not
> only reduces the time needed to start dirty logging, but also greatly
> reduces the side effects on the guest when the dirty rate is high.
>
> Signed-off-by: Keqian Zhu <zhukeqian1@...wei.com>
> ---
> arch/x86/kvm/mmu/mmu.c | 29 +++++++++++++++++++++++++----
> arch/x86/kvm/x86.c | 37 ++++++++++---------------------------
> 2 files changed, 35 insertions(+), 31 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 2ce5bc2ea46d..f52c7ceafb72 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -1188,8 +1188,7 @@ static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
> * @gfn_offset: start of the BITS_PER_LONG pages we care about
> * @mask: indicates which pages we should protect
> *
> - * Used when we do not need to care about huge page mappings: e.g. during dirty
> - * logging we do not have any such mappings.
> + * Used when we do not need to care about huge page mappings.
> */
> static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> struct kvm_memory_slot *slot,
> @@ -1246,13 +1245,35 @@ static void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm,
> * It calls kvm_mmu_write_protect_pt_masked to write protect selected pages to
> * enable dirty logging for them.
> *
> - * Used when we do not need to care about huge page mappings: e.g. during dirty
> - * logging we do not have any such mappings.
> + * We need to care about huge page mappings: e.g. during dirty logging we may
> + * have such mappings.
> */
> void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
> struct kvm_memory_slot *slot,
> gfn_t gfn_offset, unsigned long mask)
> {
> + /*
> + * Huge pages are NOT write protected when we start dirty log with
> + * init-all-set, so we must write protect them here.
> + *
> + * The gfn_offset is guaranteed to be aligned to 64, but the base_gfn
> + * of memslot has no such restriction, so the range can cross two large
> + * pages.
> + */
> + if (kvm_dirty_log_manual_protect_and_init_set(kvm)) {
> + gfn_t start = slot->base_gfn + gfn_offset + __ffs(mask);
> + gfn_t end = slot->base_gfn + gfn_offset + __fls(mask);
> +
> + kvm_mmu_slot_gfn_write_protect(kvm, slot, start, PG_LEVEL_2M);
> +
> + /* Cross two large pages? */
> + if (ALIGN(start << PAGE_SHIFT, PMD_SIZE) !=
> + ALIGN(end << PAGE_SHIFT, PMD_SIZE))
> + kvm_mmu_slot_gfn_write_protect(kvm, slot, end,
> + PG_LEVEL_2M);
> + }
> +
> + /* Then we can handle the PT level pages */
> if (kvm_x86_ops.cpu_dirty_log_size)
> kvm_mmu_clear_dirty_pt_masked(kvm, slot, gfn_offset, mask);
> else
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index eca63625aee4..dfd676ffa7da 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10888,36 +10888,19 @@ static void kvm_mmu_slot_apply_flags(struct kvm *kvm,
> */
> kvm_mmu_zap_collapsible_sptes(kvm, new);
> } else {
> - /* By default, write-protect everything to log writes. */
> - int level = PG_LEVEL_4K;
> + /*
> + * With initial-all-set, we don't need to write protect any page
> + * because they are all reported as dirty already.
> + */
> + if (kvm_dirty_log_manual_protect_and_init_set(kvm))
> + return;
>
> if (kvm_x86_ops.cpu_dirty_log_size) {
> - /*
> - * Clear all dirty bits, unless pages are treated as
> - * dirty from the get-go.
> - */
> - if (!kvm_dirty_log_manual_protect_and_init_set(kvm))
> - kvm_mmu_slot_leaf_clear_dirty(kvm, new);
> -
> - /*
> - * Write-protect large pages on write so that dirty
> - * logging happens at 4k granularity. No need to
> - * write-protect small SPTEs since write accesses are
> - * logged by the CPU via dirty bits.
> - */
> - level = PG_LEVEL_2M;
> - } else if (kvm_dirty_log_manual_protect_and_init_set(kvm)) {
> - /*
> - * If we're with initial-all-set, we don't need
> - * to write protect any small page because
> - * they're reported as dirty already. However
> - * we still need to write-protect huge pages
> - * so that the page split can happen lazily on
> - * the first write to the huge page.
> - */
> - level = PG_LEVEL_2M;
> + kvm_mmu_slot_leaf_clear_dirty(kvm, new);
> + kvm_mmu_slot_remove_write_access(kvm, new, PG_LEVEL_2M);
> + } else {
> + kvm_mmu_slot_remove_write_access(kvm, new, PG_LEVEL_4K);
> }
> - kvm_mmu_slot_remove_write_access(kvm, new, level);
> }
> }
>
>
Queued (with a few adjustments to the comments and commit messages), thanks.
Paolo
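
For readers following the "Cross two large pages?" check in the mmu.c hunk, here is a small userspace sketch of the arithmetic. It is only a model under stated assumptions, not kernel code: PAGE_SHIFT and PMD_SIZE are hard-coded to the usual x86-64 values, align_up()/align_down() stand in for the kernel's ALIGN()/ALIGN_DOWN(), the GCC/Clang builtins stand in for __ffs()/__fls(), and crosses_round_down() is a hypothetical alternative that does not appear in the patch. It shows how a 64-page mask on a memslot whose base_gfn is not 64-aligned can straddle a 2M boundary, and how a round-up comparison behaves when one end sits exactly on that boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for kernel constants (assumed x86-64 values). */
#define PAGE_SHIFT 12
#define PMD_SIZE   (UINT64_C(1) << 21)	/* 2 MiB */

typedef uint64_t gfn_t;

/* Power-of-two rounding, modelled on the kernel's ALIGN()/ALIGN_DOWN(). */
static uint64_t align_up(uint64_t x, uint64_t a)   { return (x + a - 1) & ~(a - 1); }
static uint64_t align_down(uint64_t x, uint64_t a) { return x & ~(a - 1); }

/* First and last gfn selected by a non-zero 64-bit mask. */
static gfn_t range_start(gfn_t base_gfn, gfn_t gfn_offset, uint64_t mask)
{
	return base_gfn + gfn_offset + (gfn_t)__builtin_ctzll(mask);
}

static gfn_t range_end(gfn_t base_gfn, gfn_t gfn_offset, uint64_t mask)
{
	return base_gfn + gfn_offset + (gfn_t)(63 - __builtin_clzll(mask));
}

/* The comparison as written in the quoted hunk: both ends rounded up. */
static bool crosses_round_up(gfn_t start, gfn_t end)
{
	return align_up(start << PAGE_SHIFT, PMD_SIZE) !=
	       align_up(end << PAGE_SHIFT, PMD_SIZE);
}

/* Hypothetical alternative: compare the 2M frames containing each gfn. */
static bool crosses_round_down(gfn_t start, gfn_t end)
{
	return align_down(start << PAGE_SHIFT, PMD_SIZE) !=
	       align_down(end << PAGE_SHIFT, PMD_SIZE);
}

/* base_gfn 480 is not 64-aligned, so mask bits 31 and 32 hit gfns 511 and 512. */
static void boundary_demo(void)
{
	uint64_t mask = (UINT64_C(1) << 31) | (UINT64_C(1) << 32);
	gfn_t start = range_start(480, 0, mask);
	gfn_t end = range_end(480, 0, mask);

	assert(start == 511 && end == 512);
	/* gfn 512 lies exactly on the 2M boundary, so rounding up leaves it there. */
	assert(!crosses_round_up(start, end));
	assert(crosses_round_down(start, end));
}
```

In this model the two comparisons agree whenever neither end lies exactly on a 2M boundary (for example gfns 500 and 520). They differ when start or end is itself 2M-aligned, which may be worth keeping in mind when reading the hunk.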