lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d1b3caaf-636f-48e6-90e6-0bb650753748@arm.com>
Date: Tue, 20 May 2025 15:48:01 +0100
From: Suzuki K Poulose <suzuki.poulose@....com>
To: Steven Price <steven.price@....com>, Gavin Shan <gshan@...hat.com>,
 kvm@...r.kernel.org, kvmarm@...ts.linux.dev
Cc: Catalin Marinas <catalin.marinas@....com>, Marc Zyngier <maz@...nel.org>,
 Will Deacon <will@...nel.org>, James Morse <james.morse@....com>,
 Oliver Upton <oliver.upton@...ux.dev>, Zenghui Yu <yuzenghui@...wei.com>,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
 Joey Gouly <joey.gouly@....com>, Alexandru Elisei
 <alexandru.elisei@....com>, Christoffer Dall <christoffer.dall@....com>,
 Fuad Tabba <tabba@...gle.com>, linux-coco@...ts.linux.dev,
 Ganapatrao Kulkarni <gankulkarni@...amperecomputing.com>,
 Shanker Donthineni <sdonthineni@...dia.com>, Alper Gun
 <alpergun@...gle.com>, "Aneesh Kumar K . V" <aneesh.kumar@...nel.org>
Subject: Re: [PATCH v8 20/43] arm64: RME: Runtime faulting of memory

On 16/05/2025 16:33, Steven Price wrote:
> On 01/05/2025 01:16, Gavin Shan wrote:
>> On 4/16/25 11:41 PM, Steven Price wrote:
>>> At runtime if the realm guest accesses memory which hasn't yet been
>>> mapped then KVM needs to either populate the region or fault the guest.
>>>
>>> For memory in the lower (protected) region of IPA a fresh page is
>>> provided to the RMM which will zero the contents. For memory in the
>>> upper (shared) region of IPA, the memory from the memslot is mapped
>>> into the realm VM non secure.
>>>
>>> Signed-off-by: Steven Price <steven.price@....com>
>>> ---
>>> Changes since v7:
>>>    * Remove redundant WARN_ONs for realm_create_rtt_levels() - it will
>>>      internally WARN when necessary.
>>> Changes since v6:
>>>    * Handle PAGE_SIZE being larger than RMM granule size.
>>>    * Some minor renaming following review comments.
>>> Changes since v5:
>>>    * Reduce use of struct page in preparation for supporting the RMM
>>>      having a different page size to the host.
>>>    * Handle a race when delegating a page where another CPU has faulted on
>>>      a the same page (and already delegated the physical page) but not yet
>>>      mapped it. In this case simply return to the guest to either use the
>>>      mapping from the other CPU (or refault if the race is lost).
>>>    * The changes to populate_par_region() are moved into the previous
>>>      patch where they belong.
>>> Changes since v4:
>>>    * Code cleanup following review feedback.
>>>    * Drop the PTE_SHARED bit when creating unprotected page table entries.
>>>      This is now set by the RMM and the host has no control of it and the
>>>      spec requires the bit to be set to zero.
>>> Changes since v2:
>>>    * Avoid leaking memory if failing to map it in the realm.
>>>    * Correctly mask RTT based on LPA2 flag (see rtt_get_phys()).
>>>    * Adapt to changes in previous patches.
>>> ---
>>>    arch/arm64/include/asm/kvm_emulate.h |  10 ++
>>>    arch/arm64/include/asm/kvm_rme.h     |  10 ++
>>>    arch/arm64/kvm/mmu.c                 | 127 ++++++++++++++++++-
>>>    arch/arm64/kvm/rme.c                 | 180 +++++++++++++++++++++++++++
>>>    4 files changed, 321 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/
>>> include/asm/kvm_emulate.h
>>> index c803c8188d9c..def439d6d732 100644
>>> --- a/arch/arm64/include/asm/kvm_emulate.h
>>> +++ b/arch/arm64/include/asm/kvm_emulate.h
>>> @@ -704,6 +704,16 @@ static inline bool kvm_realm_is_created(struct
>>> kvm *kvm)
>>>        return kvm_is_realm(kvm) && kvm_realm_state(kvm) !=
>>> REALM_STATE_NONE;
>>>    }
>>>    +static inline gpa_t kvm_gpa_from_fault(struct kvm *kvm, phys_addr_t
>>> ipa)
>>> +{
>>> +    if (kvm_is_realm(kvm)) {
>>> +        struct realm *realm = &kvm->arch.realm;
>>> +
>>> +        return ipa & ~BIT(realm->ia_bits - 1);
>>> +    }
>>> +    return ipa;
>>> +}
>>> +
>>>    static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
>>>    {
>>>        if (static_branch_unlikely(&kvm_rme_is_available))
>>> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/
>>> asm/kvm_rme.h
>>> index d86051ef0c5c..47aa6362c6c9 100644
>>> --- a/arch/arm64/include/asm/kvm_rme.h
>>> +++ b/arch/arm64/include/asm/kvm_rme.h
>>> @@ -108,6 +108,16 @@ void kvm_realm_unmap_range(struct kvm *kvm,
>>>                   unsigned long ipa,
>>>                   unsigned long size,
>>>                   bool unmap_private);
>>> +int realm_map_protected(struct realm *realm,
>>> +            unsigned long base_ipa,
>>> +            kvm_pfn_t pfn,
>>> +            unsigned long size,
>>> +            struct kvm_mmu_memory_cache *memcache);
>>> +int realm_map_non_secure(struct realm *realm,
>>> +             unsigned long ipa,
>>> +             kvm_pfn_t pfn,
>>> +             unsigned long size,
>>> +             struct kvm_mmu_memory_cache *memcache);
>>>      static inline bool kvm_realm_is_private_address(struct realm *realm,
>>>                            unsigned long addr)
>>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>>> index 71c04259e39f..02b66ee35426 100644
>>> --- a/arch/arm64/kvm/mmu.c
>>> +++ b/arch/arm64/kvm/mmu.c
>>> @@ -338,8 +338,13 @@ static void __unmap_stage2_range(struct
>>> kvm_s2_mmu *mmu, phys_addr_t start, u64
>>>          lockdep_assert_held_write(&kvm->mmu_lock);
>>>        WARN_ON(size & ~PAGE_MASK);
>>> -    WARN_ON(stage2_apply_range(mmu, start, end,
>>> KVM_PGT_FN(kvm_pgtable_stage2_unmap),
>>> -                   may_block));
>>> +
>>> +    if (kvm_is_realm(kvm))
>>> +        kvm_realm_unmap_range(kvm, start, size, !only_shared);
>>> +    else
>>> +        WARN_ON(stage2_apply_range(mmu, start, end,
>>> +                       KVM_PGT_FN(kvm_pgtable_stage2_unmap),
>>> +                       may_block));
>>>    }
>>>    
>>
>> As spotted previsouly, the parameter @may_block isn't handled by
>> kvm_realm_unmap_range().
> 
> Ack.
> 
>>>    void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start,
>>> @@ -359,7 +364,10 @@ static void stage2_flush_memslot(struct kvm *kvm,
>>>        phys_addr_t addr = memslot->base_gfn << PAGE_SHIFT;
>>>        phys_addr_t end = addr + PAGE_SIZE * memslot->npages;
>>>    -    kvm_stage2_flush_range(&kvm->arch.mmu, addr, end);
>>> +    if (kvm_is_realm(kvm))
>>> +        kvm_realm_unmap_range(kvm, addr, end - addr, false);
>>> +    else
>>> +        kvm_stage2_flush_range(&kvm->arch.mmu, addr, end);
>>>    }
>>>      /**
>>> @@ -1053,6 +1061,10 @@ void stage2_unmap_vm(struct kvm *kvm)
>>>        struct kvm_memory_slot *memslot;
>>>        int idx, bkt;
>>>    +    /* For realms this is handled by the RMM so nothing to do here */
>>> +    if (kvm_is_realm(kvm))
>>> +        return;
>>> +
>>>        idx = srcu_read_lock(&kvm->srcu);
>>>        mmap_read_lock(current->mm);
>>>        write_lock(&kvm->mmu_lock);
>>> @@ -1078,6 +1090,7 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>>>        if (kvm_is_realm(kvm) &&
>>>            (kvm_realm_state(kvm) != REALM_STATE_DEAD &&
>>>             kvm_realm_state(kvm) != REALM_STATE_NONE)) {
>>> +        kvm_stage2_unmap_range(mmu, 0, (~0ULL) & PAGE_MASK, false);
>>>            write_unlock(&kvm->mmu_lock);
>>>            kvm_realm_destroy_rtts(kvm, pgt->ia_bits);
>>
>> (~0ULL & PAGE_MASK) wouldn't be a problem since the range will be
>> limited to
>> [0, BIT(realm->ia_bits) - 1] in kvm_realm_unmap_range(). I think it's
>> reasonable
>> to pass the maximal size here, something like:
>>
>>          kvm_stage2_unmap_range(mmu, 0, BIT(realm->ia_bits - 1), false);

I think this must be, given the end is excluding:
	   kvm_stage2_unmap_range(mmu, 0, BIT(realm->ia_bits), false);

BIT(realm->ia_bits - 1) only covers the protected half. The unprotected 
half spans  [ BIT(realm->ia_bits - 1), BIT(realm->ia_bits))

Suzuki

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ