Message-ID: <d4c5d59f-5526-2c4f-236e-ca1536e762f6@redhat.com>
Date: Wed, 9 Feb 2022 13:34:20 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: David Matlack <dmatlack@...gle.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
seanjc@...gle.com, vkuznets@...hat.com
Subject: Re: [PATCH 06/23] KVM: MMU: load new PGD once nested two-dimensional
paging is initialized
On 2/4/22 20:18, David Matlack wrote:
> On Fri, Feb 04, 2022 at 06:57:01AM -0500, Paolo Bonzini wrote:
>> __kvm_mmu_new_pgd looks at the MMU's root_level and shadow_root_level
>> via fast_pgd_switch.
>
> Those checks are just for performance, correct (to skip iterating through
> the list of roots)?
>
> Either way, it's probably worth including a Fixes tag below.
It's actually a much bigger mess---though it's working as intended,
except that in some cases the caching is suboptimal.
Basically, you have to call __kvm_mmu_new_pgd *before* kvm_init_mmu
because of how fast_pgd_switch takes care of stashing the current root
in the cache. A PAE root is never placed in the cache exactly because
mmu->root_level and mmu->shadow_root_level hold the values for the
_current_ root. In that case, fast_pgd_switch does not look for a
cached PGD (even if according to the role it could be there).
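For reference, the check in fast_pgd_switch is roughly the following
(paraphrased from memory, not a verbatim quote of mmu.c):

    static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
                                union kvm_mmu_page_role new_role)
    {
            /*
             * Only try the prev_roots cache if the *current* root is a
             * single 64-bit root; a PAE root is neither stashed nor
             * looked up.
             */
            if (VALID_PAGE(vcpu->arch.mmu->root_hpa) &&
                vcpu->arch.mmu->root_level >= PT64_ROOT_4LEVEL &&
                vcpu->arch.mmu->shadow_root_level >= PT64_ROOT_4LEVEL)
                    return cached_root_available(vcpu, new_pgd, new_role);

            return false;
    }

so the decision is based entirely on the levels of the MMU as it is
*before* kvm_init_mmu runs.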
__kvm_mmu_new_pgd then frees the current root using kvm_mmu_free_roots.
kvm_mmu_free_roots *also* uses mmu->root_level and
mmu->shadow_root_level to distinguish whether the page table uses a
single root or 4 PAE roots. Because kvm_init_mmu can overwrite
mmu->root_level, kvm_mmu_free_roots must also be called before kvm_init_mmu.
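The corresponding logic in kvm_mmu_free_roots is, again roughly:

    if (free_active_root) {
            if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
                (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
                    /* a single 64-bit root page */
                    mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
            } else if (mmu->pae_root) {
                    /* four PAE roots */
                    for (i = 0; i < 4; ++i)
                            if (IS_VALID_PAE_ROOT(mmu->pae_root[i]))
                                    mmu_free_root_page(kvm, &mmu->pae_root[i],
                                                       &invalid_list);
            }
            mmu->root_hpa = INVALID_PAGE;
    }

which only makes sense while root_level/shadow_root_level still describe
the root that is being freed.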
I do wonder if the use of mmu->shadow_root_level was intentional (it
certainly escaped me when first reviewing fast PGD switching), but
fortunately none of this is necessary. PAE roots can be identified from
!to_shadow_page(root_hpa), so the better fix is to do that:
- in fast_pgd_switch/cached_root_available, you need to account for two
possible transformations of the cache: either the old entry becomes the
MRU of the prev_roots as in the current code, or the old entry cannot be
cached. This is 20-30 more lines of code.
- in kvm_mmu_free_roots, just use to_shadow_page instead of
mmu->root_level and mmu->shadow_root_level.
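Concretely, the free path would then look something like this (just a
sketch of the idea, not the final patch):

    if (free_active_root) {
            if (to_shadow_page(mmu->root_hpa)) {
                    /* one shadow-paged root, whatever its level */
                    mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
            } else if (mmu->pae_root) {
                    /* special PAE root, backed by mmu->pae_root */
                    for (i = 0; i < 4; ++i)
                            if (IS_VALID_PAE_ROOT(mmu->pae_root[i]))
                                    mmu_free_root_page(kvm, &mmu->pae_root[i],
                                                       &invalid_list);
            }
            mmu->root_hpa = INVALID_PAGE;
    }

with no reference at all to mmu->root_level or mmu->shadow_root_level,
so it no longer matters whether kvm_init_mmu has already run.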
Once this is in place, the original bug is fixed simply by calling
kvm_mmu_new_pgd after kvm_init_mmu.  kvm_mmu_reset_context is not doing
that now, to avoid having to figure out the new role, but it can do so
easily after the above change.
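Schematically, taking kvm_init_shadow_npt_mmu as an example (simplified,
and modulo how the role ends up being computed), the ordering changes from

    __kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
    shadow_mmu_init_context(vcpu, context, &regs, new_role);

to

    shadow_mmu_init_context(vcpu, context, &regs, new_role);
    __kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);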
I'll keep this cpu_role refactoring around, but strictly speaking it
becomes a separate change from the optimization.
Paolo