Message-ID: <d4c5d59f-5526-2c4f-236e-ca1536e762f6@redhat.com>
Date: Wed, 9 Feb 2022 13:34:20 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: David Matlack <dmatlack@...gle.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
seanjc@...gle.com, vkuznets@...hat.com
Subject: Re: [PATCH 06/23] KVM: MMU: load new PGD once nested two-dimensional
paging is initialized
On 2/4/22 20:18, David Matlack wrote:
> On Fri, Feb 04, 2022 at 06:57:01AM -0500, Paolo Bonzini wrote:
>> __kvm_mmu_new_pgd looks at the MMU's root_level and shadow_root_level
>> via fast_pgd_switch.
>
> Those checks are just for performance, correct (to skip iterating through
> the list of roots)?
>
> Either way, it's probably worth including a Fixes tag below.
It's actually a much bigger mess---though it's working as intended,
except that in some cases the caching is suboptimal.
Basically, you have to call __kvm_mmu_new_pgd *before* kvm_init_mmu
because of how fast_pgd_switch takes care of stashing the current root
in the cache. A PAE root is never placed in the cache exactly because
mmu->root_level and mmu->shadow_root_level hold the values for the
_current_ root. In that case, fast_pgd_switch does not look for a
cached PGD (even if according to the role it could be there).
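For reference, the check in fast_pgd_switch is roughly the following
(paraphrased from memory, not a verbatim quote of mmu.c):

    static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
                                union kvm_mmu_page_role new_role)
    {
            /*
             * Only try the prev_roots cache if the *current* root is a
             * single 64-bit root; a PAE root is neither stashed nor
             * looked up.
             */
            if (VALID_PAGE(vcpu->arch.mmu->root_hpa) &&
                vcpu->arch.mmu->root_level >= PT64_ROOT_4LEVEL &&
                vcpu->arch.mmu->shadow_root_level >= PT64_ROOT_4LEVEL)
                    return cached_root_available(vcpu, new_pgd, new_role);

            return false;
    }

so the decision is based entirely on the levels of the MMU as it is
*before* kvm_init_mmu runs.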
__kvm_mmu_new_pgd then frees the current root using kvm_mmu_free_roots.
kvm_mmu_free_roots *also* uses mmu->root_level and
mmu->shadow_root_level to distinguish whether the page table uses a
single root or 4 PAE roots. Because kvm_init_mmu can overwrite
mmu->root_level, kvm_mmu_free_roots must also be called before kvm_init_mmu.
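The corresponding logic in kvm_mmu_free_roots is, again roughly:

    if (free_active_root) {
            if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
                (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
                    /* a single 64-bit root page */
                    mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
            } else if (mmu->pae_root) {
                    /* four PAE roots */
                    for (i = 0; i < 4; ++i)
                            if (IS_VALID_PAE_ROOT(mmu->pae_root[i]))
                                    mmu_free_root_page(kvm, &mmu->pae_root[i],
                                                       &invalid_list);
            }
            mmu->root_hpa = INVALID_PAGE;
    }

which only makes sense while root_level/shadow_root_level still describe
the root that is being freed.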
I do wonder if the use of mmu->shadow_root_level was intentional (it
certainly escaped me when first reviewing fast PGD switching), but
fortunately none of this is necessary. PAE roots can be identified from
!to_shadow_page(root_hpa), so the better fix is to do that:
- in fast_pgd_switch/cached_root_available, you need to account for two
possible transformations of the cache: either the old entry becomes the
MRU of the prev_roots as in the current code, or the old entry cannot be
cached. This is 20-30 more lines of code.
- in kvm_mmu_free_roots, just use to_shadow_page instead of
mmu->root_level and mmu->shadow_root_level.
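Concretely, the free path would then look something like this (just a
sketch of the idea, not the final patch):

    if (free_active_root) {
            if (to_shadow_page(mmu->root_hpa)) {
                    /* one shadow-paged root, whatever its level */
                    mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
            } else if (mmu->pae_root) {
                    /* special PAE root, backed by mmu->pae_root */
                    for (i = 0; i < 4; ++i)
                            if (IS_VALID_PAE_ROOT(mmu->pae_root[i]))
                                    mmu_free_root_page(kvm, &mmu->pae_root[i],
                                                       &invalid_list);
            }
            mmu->root_hpa = INVALID_PAGE;
    }

with no reference at all to mmu->root_level or mmu->shadow_root_level,
so it no longer matters whether kvm_init_mmu has already run.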
Once this is in place, the original bug is fixed simply by calling
kvm_mmu_new_pgd after kvm_init_mmu.  kvm_mmu_reset_context is not doing
that now, to avoid having to figure out the new role, but it can do so
easily after the above change.
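Schematically, taking kvm_init_shadow_npt_mmu as an example (simplified,
and modulo how the role ends up being computed), the ordering changes from

    __kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
    shadow_mmu_init_context(vcpu, context, &regs, new_role);

to

    shadow_mmu_init_context(vcpu, context, &regs, new_role);
    __kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);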
I'll keep this cpu_role refactoring around, but strictly speaking it
becomes a separate change from the optimization.
Paolo