[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <a8183dcd-f82e-49b3-b5b4-96e5363e060a@redhat.com>
Date: Thu, 10 Jul 2025 15:24:22 +1000
From: Gavin Shan <gshan@...hat.com>
To: Steven Price <steven.price@....com>, kvm@...r.kernel.org,
kvmarm@...ts.linux.dev
Cc: Catalin Marinas <catalin.marinas@....com>, Marc Zyngier <maz@...nel.org>,
Will Deacon <will@...nel.org>, James Morse <james.morse@....com>,
Oliver Upton <oliver.upton@...ux.dev>,
Suzuki K Poulose <suzuki.poulose@....com>, Zenghui Yu
<yuzenghui@...wei.com>, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, Joey Gouly <joey.gouly@....com>,
Alexandru Elisei <alexandru.elisei@....com>,
Christoffer Dall <christoffer.dall@....com>, Fuad Tabba <tabba@...gle.com>,
linux-coco@...ts.linux.dev,
Ganapatrao Kulkarni <gankulkarni@...amperecomputing.com>,
Shanker Donthineni <sdonthineni@...dia.com>, Alper Gun
<alpergun@...gle.com>, "Aneesh Kumar K . V" <aneesh.kumar@...nel.org>,
Emi Kisanuki <fj0570is@...itsu.com>
Subject: Re: [PATCH v9 15/43] arm64: RME: Allow VMM to set RIPAS
Hi Steve,
On 7/10/25 12:42 AM, Steven Price wrote:
> On 02/07/2025 01:37, Gavin Shan wrote:
>> On 6/11/25 8:48 PM, Steven Price wrote:
>>> Each page within the protected region of the realm guest can be marked
>>> as either RAM or EMPTY. Allow the VMM to control this before the guest
>>> has started and provide the equivalent functions to change this (with
>>> the guest's approval) at runtime.
>>>
>>> When transitioning from RIPAS RAM (1) to RIPAS EMPTY (0) the memory is
>>> unmapped from the guest and undelegated allowing the memory to be reused
>>> by the host. When transitioning to RIPAS RAM the actual population of
>>> the leaf RTTs is done later on stage 2 fault, however it may be
>>> necessary to allocate additional RTTs to allow the RMM track the RIPAS
>>> for the requested range.
>>>
>>> When freeing a block mapping it is necessary to temporarily unfold the
>>> RTT which requires delegating an extra page to the RMM, this page can
>>> then be recovered once the contents of the block mapping have been
>>> freed.
>>>
>>> Signed-off-by: Steven Price <steven.price@....com>
>>> ---
>>> Changes from v8:
>>> * Propagate the 'may_block' flag to allow conditional calls to
>>> cond_resched_rwlock_write().
>>> * Introduce alloc_rtt() to wrap alloc_delegated_granule() and
>>> kvm_account_pgtable_pages() and use when allocating RTTs.
>>> * Code reorganisation to allow init_ipa_state and set_ipa_state to
>>> share a common ripas_change() function,
>>> * Other minor changes following review.
>>> Changes from v7:
>>> * Replace use of "only_shared" with the upstream "attr_filter" field
>>> of struct kvm_gfn_range.
>>> * Clean up the logic in alloc_delegated_granule() for when to call
>>> kvm_account_pgtable_pages().
>>> * Rename realm_destroy_protected_granule() to
>>> realm_destroy_private_granule() to match the naming elsewhere. Also
>>> fix the return codes in the function to be descriptive.
>>> * Several other minor changes to names/return codes.
>>> Changes from v6:
>>> * Split the code dealing with the guest triggering a RIPAS change into
>>> a separate patch, so this patch is purely for the VMM setting up the
>>> RIPAS before the guest first runs.
>>> * Drop the useless flags argument from alloc_delegated_granule().
>>> * Account RTTs allocated for a guest using kvm_account_pgtable_pages().
>>> * Deal with the RMM granule size potentially being smaller than the
>>> host's PAGE_SIZE. Although note alloc_delegated_granule() currently
>>> still allocates an entire host page for every RMM granule (so wasting
>>> memory when PAGE_SIZE>4k).
>>> Changes from v5:
>>> * Adapt to rebasing.
>>> * Introduce find_map_level()
>>> * Rename some functions to be clearer.
>>> * Drop the "spare page" functionality.
>>> Changes from v2:
>>> * {alloc,free}_delegated_page() moved from previous patch to this one.
>>> * alloc_delegated_page() now takes a gfp_t flags parameter.
>>> * Fix the reference counting of guestmem pages to avoid leaking memory.
>>> * Several misc code improvements and extra comments.
>>> ---
>>> arch/arm64/include/asm/kvm_rme.h | 6 +
>>> arch/arm64/kvm/mmu.c | 8 +-
>>> arch/arm64/kvm/rme.c | 447 +++++++++++++++++++++++++++++++
>>> 3 files changed, 458 insertions(+), 3 deletions(-)
>>>
>>
>> With below nitpicks addressed. The changes looks good to me.
>>
>> Reviewed-by: Gavin Shan <gshan@...hat.com>
>
> Thanks, most the nitpicks I agree - thanks for raising. Just one below I
> wanted to comment on...
>
> [...]
You're welcome.
>>> +
>>> +enum ripas_action {
>>> + RIPAS_INIT,
>>> + RIPAS_SET,
>>> +};
>>> +
>>> +static int ripas_change(struct kvm *kvm,
>>> + struct kvm_vcpu *vcpu,
>>> + unsigned long ipa,
>>> + unsigned long end,
>>> + enum ripas_action action,
>>> + unsigned long *top_ipa)
>>> +{
>>
>> The 'enum ripas_action' is used in limited scope, I would replace it
>> with a 'bool'
>> parameter to ripas_change(), something like below. If we plan to support
>> more actions
>> in future, then the 'enum ripas_action' makes sense to me.
>
> The v1.1 spec[1] adds RMI_RTT_SET_S2AP (set stage 2 access permission).
> So that adds a third option to the enum. I agree the enum is a little
> clunky but it allows extension and at least spells out the action which
> is occurring.
>
> The part I'm not especially happy with is the 'vcpu' argument which is
> not applicable to RIPAS_INIT but otherwise required (and in those cases
> could replace 'kvm'). But I couldn't come up with a better solution for
> that.
>
> [1] Available from:
> https://developer.arm.com/documentation/den0137/latest (following the
> small "here" link near the end).
>
Right, it's as I guessed. A enum looks good if we need to extend it
to cover the third case (RMI_RTT_SET_S2AP in RMMv1.1). Note that I just
started looking into RMMv1.1 implementation several days ago and didn't
have a good understanding on RMMv1.1 at present :-)
Thanks,
Gavin
> Thanks,
> Steve
>
>> static int ripas_change(struct kvm *kvm,
>> struct kvm_vcpu *vcpu,
>> unsigned long ipa,
>> unsigned long end,
>> bool set_ripas,
>> unsigned long *top_ipa)
>>
>>> + struct realm *realm = &kvm->arch.realm;
>>> + phys_addr_t rd_phys = virt_to_phys(realm->rd);
>>> + phys_addr_t rec_phys;
>>> + struct kvm_mmu_memory_cache *memcache = NULL;
>>> + int ret = 0;
>>> +
>>> + if (vcpu) {
>>> + rec_phys = virt_to_phys(vcpu->arch.rec.rec_page);
>>> + memcache = &vcpu->arch.mmu_page_cache;
>>> +
>>> + WARN_ON(action != RIPAS_SET);
>>> + } else {
>>> + WARN_ON(action != RIPAS_INIT);
>>> + }
>>> +
>>> + while (ipa < end) {
>>> + unsigned long next;
>>> +
>>> + switch (action) {
>>> + case RIPAS_INIT:
>>> + ret = rmi_rtt_init_ripas(rd_phys, ipa, end, &next);
>>> + break;
>>> + case RIPAS_SET:
>>> + ret = rmi_rtt_set_ripas(rd_phys, rec_phys, ipa, end,
>>> + &next);
>>> + break;
>>> + }
>>> +
>>
>> if 'enum ripas_action' is replaced by 'bool set_ripas' as above, this needs
>> twist either.
>>
>>> + switch (RMI_RETURN_STATUS(ret)) {
>>> + case RMI_SUCCESS:
>>> + ipa = next;
>>> + break;
>>> + case RMI_ERROR_RTT:
>>> + int err_level = RMI_RETURN_INDEX(ret);
>>> + int level = find_map_level(realm, ipa, end);
>>> +
>>> + if (err_level >= level)
>>> + return -EINVAL;
>>> +
>>> + ret = realm_create_rtt_levels(realm, ipa, err_level,
>>> + level, memcache);
>>> + if (ret)
>>> + return ret;
>>> + /* Retry with the RTT levels in place */
>>> + break;
>>> + default:
>>> + WARN_ON(1);
>>> + return -ENXIO;
>>> + }
>>> + }
>>> +
>>> + if (top_ipa)
>>> + *top_ipa = ipa;
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static int realm_init_ipa_state(struct kvm *kvm,
>>> + unsigned long ipa,
>>> + unsigned long end)
>>> +{
>>> + return ripas_change(kvm, NULL, ipa, end, RIPAS_INIT, NULL);
>>> +}
>>> +
>>> +static int kvm_init_ipa_range_realm(struct kvm *kvm,
>>> + struct arm_rme_init_ripas *args)
>>> +{
>>> + gpa_t addr, end;
>>> +
>>> + addr = args->base;
>>> + end = addr + args->size;
>>> +
>>> + if (end < addr)
>>> + return -EINVAL;
>>> +
>>> + if (kvm_realm_state(kvm) != REALM_STATE_NEW)
>>> + return -EPERM;
>>> +
>>> + return realm_init_ipa_state(kvm, addr, end);
>>> +}
>>> +
>>> /* Protects access to rme_vmid_bitmap */
>>> static DEFINE_SPINLOCK(rme_vmid_lock);
>>> static unsigned long *rme_vmid_bitmap;
>>> @@ -441,6 +876,18 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct
>>> kvm_enable_cap *cap)
>>> case KVM_CAP_ARM_RME_CREATE_REALM:
>>> r = kvm_create_realm(kvm);
>>> break;
>>> + case KVM_CAP_ARM_RME_INIT_RIPAS_REALM: {
>>> + struct arm_rme_init_ripas args;
>>> + void __user *argp = u64_to_user_ptr(cap->args[1]);
>>> +
>>> + if (copy_from_user(&args, argp, sizeof(args))) {
>>> + r = -EFAULT;
>>> + break;
>>> + }
>>> +
>>> + r = kvm_init_ipa_range_realm(kvm, &args);
>>> + break;
>>> + }
>>> default:
>>> r = -EINVAL;
>>> break;
>>
>> Thanks,
>> Gavin
>>
>
Powered by blists - more mailing lists