[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <a89b9047-7d0c-41c7-93a3-f31e55315ebd@arm.com>
Date: Wed, 1 Oct 2025 16:54:21 +0100
From: Steven Price <steven.price@....com>
To: Marc Zyngier <maz@...nel.org>
Cc: kvm@...r.kernel.org, kvmarm@...ts.linux.dev,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
James Morse <james.morse@....com>, Oliver Upton <oliver.upton@...ux.dev>,
Suzuki K Poulose <suzuki.poulose@....com>, Zenghui Yu
<yuzenghui@...wei.com>, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, Joey Gouly <joey.gouly@....com>,
Alexandru Elisei <alexandru.elisei@....com>,
Christoffer Dall <christoffer.dall@....com>, Fuad Tabba <tabba@...gle.com>,
linux-coco@...ts.linux.dev,
Ganapatrao Kulkarni <gankulkarni@...amperecomputing.com>,
Gavin Shan <gshan@...hat.com>, Shanker Donthineni <sdonthineni@...dia.com>,
Alper Gun <alpergun@...gle.com>, "Aneesh Kumar K . V"
<aneesh.kumar@...nel.org>, Emi Kisanuki <fj0570is@...itsu.com>,
Vishal Annapurve <vannapurve@...gle.com>
Subject: Re: [PATCH v10 09/43] KVM: arm64: Allow passing machine type in KVM
creation
On 01/10/2025 14:50, Marc Zyngier wrote:
> On Wed, 20 Aug 2025 15:55:29 +0100,
> Steven Price <steven.price@....com> wrote:
>>
>> Previously machine type was used purely for specifying the physical
>> address size of the guest. Reserve the higher bits to specify an ARM
>> specific machine type and declare a new type 'KVM_VM_TYPE_ARM_REALM'
>> used to create a realm guest.
>>
>> Reviewed-by: Suzuki K Poulose <suzuki.poulose@....com>
>> Reviewed-by: Gavin Shan <gshan@...hat.com>
>> Signed-off-by: Steven Price <steven.price@....com>
>> ---
>> Changes since v9:
>> * Explictly set realm.state to REALM_STATE_NONE rather than rely on the
>> zeroing of the structure.
>> Changes since v7:
>> * Add some documentation explaining the new machine type.
>> Changes since v6:
>> * Make the check for kvm_rme_is_available more visible and report an
>> error code of -EPERM (instead of -EINVAL) to make it explicit that
>> the kernel supports RME, but the platform doesn't.
>> ---
>> Documentation/virt/kvm/api.rst | 16 ++++++++++++++--
>> arch/arm64/kvm/arm.c | 16 ++++++++++++++++
>> arch/arm64/kvm/mmu.c | 3 ---
>> include/uapi/linux/kvm.h | 19 +++++++++++++++----
>> 4 files changed, 45 insertions(+), 9 deletions(-)
>>
>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>> index 69c0a9eba6c5..fad3191df311 100644
>> --- a/Documentation/virt/kvm/api.rst
>> +++ b/Documentation/virt/kvm/api.rst
>> @@ -181,8 +181,20 @@ flag KVM_VM_MIPS_VZ.
>> ARM64:
>> ^^^^^^
>>
>> -On arm64, the physical address size for a VM (IPA Size limit) is limited
>> -to 40bits by default. The limit can be configured if the host supports the
>> +On arm64, the machine type identifier is used to encode a type and the
>> +physical address size for the VM. The lower byte (bits[7-0]) encode the
>> +address size and the upper bits[11-8] encode a machine type. The machine
>> +types that might be available are:
>> +
>> + ====================== ============================================
>> + KVM_VM_TYPE_ARM_NORMAL A standard VM
>> + KVM_VM_TYPE_ARM_REALM A "Realm" VM using the Arm Confidential
>> + Compute extensions, the VM's memory is
>> + protected from the host.
>> + ====================== ============================================
>> +
>> +The physical address size for a VM (IPA Size limit) is limited to 40bits
>> +by default. The limit can be configured if the host supports the
>> extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
>> KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits) to set the size in the machine type
>> identifier, where IPA_Bits is the maximum width of any physical
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index 8c0e9ec34f0a..5b582b705eee 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -172,6 +172,22 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>> mutex_unlock(&kvm->lock);
>> #endif
>>
>> + if (type & ~(KVM_VM_TYPE_ARM_MASK | KVM_VM_TYPE_ARM_IPA_SIZE_MASK))
>> + return -EINVAL;
>> +
>> + switch (type & KVM_VM_TYPE_ARM_MASK) {
>> + case KVM_VM_TYPE_ARM_NORMAL:
>> + break;
>> + case KVM_VM_TYPE_ARM_REALM:
>> + if (!static_branch_unlikely(&kvm_rme_is_available))
>> + return -EPERM;
>
> EPERM? That's rather odd. You can only do that if the CCA capability
> has been advertised. So asking for it when you can find that it is not
> there deserves more of an EINVAL response.
Ack
>> + WRITE_ONCE(kvm->arch.realm.state, REALM_STATE_NONE);
>> + kvm->arch.is_realm = true;
>> + break;
>> + default:
>> + return -EINVAL;
>> + }
>> +
>> kvm_init_nested(kvm);
>>
>> ret = kvm_share_hyp(kvm, kvm + 1);
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index de10dbde4761..130f28dfb3cb 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -881,9 +881,6 @@ static int kvm_init_ipa_range(struct kvm *kvm,
>> if (kvm_is_realm(kvm))
>> kvm_ipa_limit = kvm_realm_ipa_limit();
>>
>> - if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
>> - return -EINVAL;
>> -
>
> How about the *other* bits? You need to be able to detect that
> userspace is using the reserved bits and error out.
That check is now further up in kvm_arch_init_vm():
>> + if (type & ~(KVM_VM_TYPE_ARM_MASK | KVM_VM_TYPE_ARM_IPA_SIZE_MASK))
>> + return -EINVAL;
(along with the default branch in the switch handling unknown 'types')
>> phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
>> if (is_protected_kvm_enabled()) {
>> phys_shift = kvm_ipa_limit;
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 7dafb443368a..b70ecee918de 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -672,14 +672,25 @@ struct kvm_enable_cap {
>> #define KVM_S390_SIE_PAGE_OFFSET 1
>>
>> /*
>> - * On arm64, machine type can be used to request the physical
>> - * address size for the VM. Bits[7-0] are reserved for the guest
>> - * PA size shift (i.e, log2(PA_Size)). For backward compatibility,
>> - * value 0 implies the default IPA size, 40bits.
>> + * On arm64, machine type can be used to request both the machine type and
>> + * the physical address size for the VM.
>> + *
>> + * Bits[11-8] are reserved for the ARM specific machine type.
>> + *
>> + * Bits[7-0] are reserved for the guest PA size shift (i.e, log2(PA_Size)).
>> + * For backward compatibility, value 0 implies the default IPA size, 40bits.
>> */
>> +#define KVM_VM_TYPE_ARM_SHIFT 8
>> +#define KVM_VM_TYPE_ARM_MASK (0xfULL << KVM_VM_TYPE_ARM_SHIFT)
>> +#define KVM_VM_TYPE_ARM(_type) \
>> + (((_type) << KVM_VM_TYPE_ARM_SHIFT) & KVM_VM_TYPE_ARM_MASK)
>> +#define KVM_VM_TYPE_ARM_NORMAL KVM_VM_TYPE_ARM(0)
>> +#define KVM_VM_TYPE_ARM_REALM KVM_VM_TYPE_ARM(1)
>
> Why can't that be just "PROTECTED", using bit 31, just like pKVM? I
> don't see the point in deviating from an established practice.
As far as I'm aware this pKVM practise isn't upstream. We also need to
distinguish between pKVM and CCA - it's entirely possible that both
might coexist on the same platform.
This is just extending a proposal that Will posted years ago[1] for pkvm
with room for more types - that used bit 8 for a
KVM_VM_TYPE_ARM_PROTECTED flag. I agree there might be an argument for
the realm type being 2 if we want to keep 1 for pKVM. But it looks like
downstream pKVM has done something different and incompatible anyway.
Thanks,
Steve
[1]
https://lists.infradead.org/pipermail/linux-arm-kernel/2022-May/744865.html
Powered by blists - more mailing lists