linux-kernel - Re: [PATCH v5.5 26/30] KVM: Keep memslots in tree-based structures instead of array-based ones


Message-ID: <d1c648e4-5536-111d-a7bf-3644ac68c9f5@oracle.com>
Date:   Sat, 13 Nov 2021 16:22:48 +0100
From:   "Maciej S. Szmigiero" <maciej.szmigiero@...cle.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     James Morse <james.morse@....com>,
        Alexandru Elisei <alexandru.elisei@....com>,
        Suzuki K Poulose <suzuki.poulose@....com>,
        Atish Patra <atish.patra@....com>,
        David Hildenbrand <david@...hat.com>,
        Cornelia Huck <cohuck@...hat.com>,
        Claudio Imbrenda <imbrenda@...ux.ibm.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
        linux-mips@...r.kernel.org, kvm@...r.kernel.org,
        kvm-ppc@...r.kernel.org, kvm-riscv@...ts.infradead.org,
        linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org,
        Ben Gardon <bgardon@...gle.com>, Marc Zyngier <maz@...nel.org>,
        Huacai Chen <chenhuacai@...nel.org>,
        Aleksandar Markovic <aleksandar.qemu.devel@...il.com>,
        Paul Mackerras <paulus@...abs.org>,
        Anup Patel <anup.patel@....com>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Albert Ou <aou@...s.berkeley.edu>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        Janosch Frank <frankja@...ux.ibm.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH v5.5 26/30] KVM: Keep memslots in tree-based structures
 instead of array-based ones

On 12.11.2021 01:51, Sean Christopherson wrote:
> On Fri, Nov 12, 2021, Maciej S. Szmigiero wrote:
>> On 04.11.2021 01:25, Sean Christopherson wrote:
>>> -	/*
>>> -	 * Remove the old memslot from the hash list and interval tree, copying
>>> -	 * the node data would corrupt the structures.
>>> -	 */
>>> +	int as_id = kvm_memslots_get_as_id(old, new);
>>> +	struct kvm_memslots *slots = kvm_get_inactive_memslots(kvm, as_id);
>>> +	int idx = slots->node_idx;
>>> +
>>>    	if (old) {
>>> -		hash_del(&old->id_node);
>>> -		interval_tree_remove(&old->hva_node, &slots->hva_tree);
>>> +		hash_del(&old->id_node[idx]);
>>> +		interval_tree_remove(&old->hva_node[idx], &slots->hva_tree);
>>> -		if (!new)
>>> +		if ((long)old == atomic_long_read(&slots->last_used_slot))
>>> +			atomic_long_set(&slots->last_used_slot, (long)new);
>>
>> Open-coding cmpxchg() is way less readable than a direct call.
> 
> Doh, I meant to call this out and/or add a comment.
> 
> My objection to cmpxchg() is that it implies atomicity is required (the kernel's
> version adds the lock), which is very much not the case.  So this isn't strictly
> an open-coded version of cmpxchg().
> 
>> The open-coded version also compiles on x86 to multiple instructions with
>> a branch, instead of just a single instruction.
> 
> Yeah.  The lock can't be contended, so that part of cmpxchg is a non-issue.  But
> that's also why I don't love using cmpxchg.
> 
> I don't have a strong preference, I just got briefly confused by the atomicity part.

We can simply add a comment there explaining that atomicity isn't strictly
required here. Will do that.
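
For reference, the two forms under discussion can be modelled in userspace
with C11 atomics. This is only a sketch: struct toy_slots and both helper
names are made up for illustration, and relaxed loads/stores stand in for
atomic_long_read()/atomic_long_set(). It is not the kernel code itself.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the memslots' cached "last used slot" value. */
struct toy_slots {
	atomic_long last_used_slot;
};

/*
 * Open-coded form, as in the quoted hunk: a load, a compare and a store.
 * The sequence as a whole is not atomic, which is acceptable only when no
 * other writer can race with it.
 */
static void replace_open_coded(struct toy_slots *slots, long old_val, long new_val)
{
	assert(slots);
	if (atomic_load_explicit(&slots->last_used_slot,
				 memory_order_relaxed) == old_val)
		atomic_store_explicit(&slots->last_used_slot, new_val,
				      memory_order_relaxed);
}

/*
 * Compare-and-exchange form: one atomic read-modify-write.
 * Returns nonzero if the stored value matched old_val and was replaced.
 */
static int replace_cmpxchg(struct toy_slots *slots, long old_val, long new_val)
{
	assert(slots);
	return atomic_compare_exchange_strong(&slots->last_used_slot,
					      &old_val, new_val);
}
```

Both helpers leave the value untouched when it does not match old_val; they
differ only in whether the check and the update happen as one atomic step.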

>>> +static void kvm_invalidate_memslot(struct kvm *kvm,
>>> +				   struct kvm_memory_slot *old,
>>> +				   struct kvm_memory_slot *working_slot)
>>> +{
>>> +	/*
>>> +	 * Mark the current slot INVALID.  As with all memslot modifications,
>>> +	 * this must be done on an unreachable slot to avoid modifying the
>>> +	 * current slot in the active tree.
>>> +	 */
>>> +	kvm_copy_memslot(working_slot, old);
>>> +	working_slot->flags |= KVM_MEMSLOT_INVALID;
>>> +	kvm_replace_memslot(kvm, old, working_slot);
>>> +
>>> +	/*
>>> +	 * Activate the slot that is now marked INVALID, but don't propagate
>>> +	 * the slot to the now inactive slots. The slot is either going to be
>>> +	 * deleted or recreated as a new slot.
>>> +	 */
>>> +	kvm_swap_active_memslots(kvm, old->as_id);
>>> +
>>> +	/*
>>> +	 * From this point no new shadow pages pointing to a deleted, or moved,
>>> +	 * memslot will be created.  Validation of sp->gfn happens in:
>>> +	 *	- gfn_to_hva (kvm_read_guest, gfn_to_pfn)
>>> +	 *	- kvm_is_visible_gfn (mmu_check_root)
>>> +	 */
>>> +	kvm_arch_flush_shadow_memslot(kvm, old);
>>
>> This should flush the currently active slot (that is, "working_slot",
>> not "old") to not introduce a behavior change with respect to the existing
>> code.
>>
>> That's also what the previous version of this patch set did.
> 
> Eww.  I would much prefer to "fix" the existing code in a prep patch.  It shouldn't
> matter, but arch code really should not get passed an INVALID slot.
> 

I will add a separate patch that switches that kvm_arch_flush_shadow_memslot()
call to use a valid (old) memslot instead.

It is actually simpler to do it *after* the main patch series, so as not to
add more dead code that later patches remove anyway.
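
To make the "old" vs "working_slot" distinction concrete, here is a toy
userspace model of the invalidation step quoted above. Everything in it is
an assumption for illustration: the toy_* types, the TOY_MEMSLOT_INVALID
value and the single "active" pointer that stands in for the memslot
replace/swap machinery. It only shows which slot ends up carrying the
INVALID flag and which one reaches the flush hook.

```c
#include <assert.h>

/* Arbitrary flag bit for this sketch; not taken from kernel headers. */
#define TOY_MEMSLOT_INVALID (1u << 16)

struct toy_slot {
	unsigned long base_gfn;
	unsigned long npages;
	unsigned int flags;
};

struct toy_vm {
	struct toy_slot *active;        /* slot that readers currently see */
	const struct toy_slot *flushed; /* slot last handed to the flush hook */
};

/* Stand-in for the arch flush hook: just records what it was given. */
static void toy_arch_flush(struct toy_vm *vm, const struct toy_slot *slot)
{
	vm->flushed = slot;
}

static void toy_invalidate(struct toy_vm *vm, struct toy_slot *old,
			   struct toy_slot *working)
{
	assert(vm->active == old);

	/* Modify a copy so the slot readers can still reach is never touched. */
	*working = *old;
	working->flags |= TOY_MEMSLOT_INVALID;

	/* Publish the INVALID copy as the active slot. */
	vm->active = working;

	/* Hand the flush hook the original slot, which was never marked INVALID. */
	toy_arch_flush(vm, old);
}
```

In this model the active slot is the INVALID copy, while the flush hook
still sees a slot with its original flags and the same gfn range.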

Thanks,
Maciej