Message-ID: <ba9e1a56-f769-01c1-607f-3630a62a1b5d@redhat.com>
Date:   Fri, 11 Feb 2022 02:07:48 +0100
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        vkuznets@...hat.com, mlevitsk@...hat.com, dmatlack@...gle.com
Subject: Re: [PATCH 08/12] KVM: MMU: do not consult levels when freeing roots

On 2/11/22 01:54, Sean Christopherson wrote:
> On Fri, Feb 11, 2022, Sean Christopherson wrote:
>> On Wed, Feb 09, 2022, Paolo Bonzini wrote:
>>> Right now, PGD caching requires a complicated dance of first computing
>>> the MMU role and passing it to __kvm_mmu_new_pgd, and then separately calling
>>
>> Nit, adding () after function names helps readers easily recognize when you're
>> talking about a specific function, e.g. as opposed to a concept or whatever.
>>
>>> kvm_init_mmu.
>>>
>>> Part of this is due to kvm_mmu_free_roots using mmu->root_level and
>>> mmu->shadow_root_level to distinguish whether the page table uses a single
>>> root or 4 PAE roots.  Because kvm_init_mmu can overwrite mmu->root_level,
>>> kvm_mmu_free_roots must be called before kvm_init_mmu.
>>>
>>> However, even after kvm_init_mmu there is a way to detect whether the page table
>>> has a single root or four, because the pae_root does not have an associated
>>> struct kvm_mmu_page.
>>
>> Suggest a reword on the final paragraph, because there's a discrepancy with the
>> code (which handles 0, 1, or 4 "roots", versus just "single or four").
>>
>>    However, even after kvm_init_mmu() there is a way to detect whether the
>>    page table may hold PAE roots, as root.hpa isn't backed by a shadow when
>>    it points at PAE roots.
>>
>>> Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
>>> ---
>>>   arch/x86/kvm/mmu/mmu.c | 10 ++++++----
>>>   1 file changed, 6 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>>> index 3c3f597ea00d..95d0fa0bb876 100644
>>> --- a/arch/x86/kvm/mmu/mmu.c
>>> +++ b/arch/x86/kvm/mmu/mmu.c
>>> @@ -3219,12 +3219,15 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
>>>   	struct kvm *kvm = vcpu->kvm;
>>>   	int i;
>>>   	LIST_HEAD(invalid_list);
>>> -	bool free_active_root = roots_to_free & KVM_MMU_ROOT_CURRENT;
>>> +	bool free_active_root;
>>>   
>>>   	BUILD_BUG_ON(KVM_MMU_NUM_PREV_ROOTS >= BITS_PER_LONG);
>>>   
>>>   	/* Before acquiring the MMU lock, see if we need to do any real work. */
>>> -	if (!(free_active_root && VALID_PAGE(mmu->root.hpa))) {
>>> +	free_active_root = (roots_to_free & KVM_MMU_ROOT_CURRENT)
>>> +		&& VALID_PAGE(mmu->root.hpa);
>>
>> 	free_active_root = (roots_to_free & KVM_MMU_ROOT_CURRENT) &&
>> 			   VALID_PAGE(mmu->root.hpa);
>>
>> Isn't this a separate bug fix?  E.g. call kvm_mmu_unload() without a valid current
>> root, but with valid previous roots?  In which case we'd try to free garbage, no?

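mmu_free_root_page() checks VALID_PAGE(*root_hpa) itself, so an invalid
root is simply skipped.  If that's what you meant, then it isn't a
preexisting bug (and I think a call without a valid current root but
with valid previous roots is a fairly common case).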
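
As a rough illustration only (a userspace sketch, not the kernel code):
the toy_ names, the hpa_t typedef and the INVALID_PAGE definition below
are simplified stand-ins assumed for the example.  The helper returns
early on an invalid root, so passing an invalid current root together
with valid previous roots does not free garbage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins, assumed for illustration only. */
typedef uint64_t hpa_t;
#define INVALID_PAGE		((hpa_t)-1)
#define VALID_PAGE(x)		((x) != INVALID_PAGE)
#define TOY_NUM_PREV_ROOTS	3

/*
 * Invalidate one root.  Returns 1 if the root was valid and has been
 * invalidated, 0 if it was already invalid and nothing was done.
 */
static int toy_free_root_page(hpa_t *root_hpa)
{
	assert(root_hpa != NULL);

	if (!VALID_PAGE(*root_hpa))
		return 0;

	*root_hpa = INVALID_PAGE;
	return 1;
}

/* Try to free the current root and every previous root. */
static int toy_free_all_roots(hpa_t *cur, hpa_t prev[TOY_NUM_PREV_ROOTS])
{
	int freed = toy_free_root_page(cur);
	int i;

	for (i = 0; i < TOY_NUM_PREV_ROOTS; i++)
		freed += toy_free_root_page(&prev[i]);

	return freed;
}
```

With an invalid current root and two valid previous roots, only the two
valid entries are touched; a second call does nothing.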

>>> +
>>> +	if (!free_active_root) {
>>>   		for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
>>>   			if ((roots_to_free & KVM_MMU_ROOT_PREVIOUS(i)) &&
>>>   			    VALID_PAGE(mmu->prev_roots[i].hpa))
>>> @@ -3242,8 +3245,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
>>>   					   &invalid_list);
>>>   
>>>   	if (free_active_root) {
>>> -		if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
>>> -		    (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
>>> +		if (to_shadow_page(mmu->root.hpa)) {
>>>   			mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list);
>>>   		} else if (mmu->pae_root) {
> 
> Gah, this is technically wrong.  It shouldn't truly matter, but it's wrong.  root.hpa
> will not be backed by shadow page if the root is pml4_root or pml5_root, in which
> case freeing the PAE root is wrong.  They should obviously be invalid already, but
> it's a little confusing because KVM wanders down a path that may not be relevant
> to the current mode.

pml4_root and pml5_root are dummies, and the first "real" level of page
tables is stored in pae_root for that case too, so I think that should
do the right thing.

That's why I also disliked the shadow_root_level/root_level/direct
check: even though there are half a dozen cases involved, they all boil
down to either 4 pae_roots or a single root with a backing kvm_mmu_page.
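
A minimal userspace sketch of that two-way split, under stated
assumptions: struct toy_mmu, struct toy_shadow_page and the root_sp
pointer are hypothetical stand-ins (the kernel derives the backing page
with to_shadow_page(mmu->root.hpa) rather than storing a pointer), and
the refcounting is reduced to a single counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins, assumed for illustration only. */
typedef uint64_t hpa_t;
#define INVALID_PAGE	((hpa_t)-1)
#define VALID_PAGE(x)	((x) != INVALID_PAGE)

/* Toy replacement for struct kvm_mmu_page. */
struct toy_shadow_page {
	int root_count;
};

struct toy_mmu {
	hpa_t root_hpa;
	/* NULL when root_hpa points at pae_root or a dummy pml4/pml5 root. */
	struct toy_shadow_page *root_sp;
	hpa_t pae_root[4];
	bool have_pae_root;
};

/* Drop one root if it is valid; returns the number of roots dropped. */
static int toy_free_one(hpa_t *root_hpa, struct toy_shadow_page *sp)
{
	assert(root_hpa != NULL);

	if (!VALID_PAGE(*root_hpa))
		return 0;
	if (sp)
		sp->root_count--;
	*root_hpa = INVALID_PAGE;
	return 1;
}

/* No level checks: the backing page alone decides which path is taken. */
static int toy_free_active_root(struct toy_mmu *mmu)
{
	int freed = 0;
	int i;

	if (mmu->root_sp) {
		/* Single root with a backing shadow page. */
		freed += toy_free_one(&mmu->root_hpa, mmu->root_sp);
		mmu->root_sp = NULL;
	} else if (mmu->have_pae_root) {
		/* Up to four PAE roots; invalid entries are skipped. */
		for (i = 0; i < 4; i++)
			freed += toy_free_one(&mmu->pae_root[i], NULL);
		mmu->root_hpa = INVALID_PAGE;
	}
	return freed;
}
```

The sketch never looks at a root level or a direct flag, which is the
property the patch relies on.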

It's even more obscure to check shadow_root_level/root_level/direct in
fast_pgd_switch(), where it's pretty obvious that you cannot cache 4
pae_roots in a single (hpa, pgd) pair...

Paolo
