Message-ID: <yq5apl86tteg.fsf@kernel.org>
Date: Mon, 22 Dec 2025 21:12:31 +0530
From: Aneesh Kumar K.V <aneesh.kumar@...nel.org>
To: Steven Price <steven.price@....com>, linux-kernel@...r.kernel.org,
iommu@...ts.linux.dev, linux-coco@...ts.linux.dev
Cc: Catalin Marinas <catalin.marinas@....com>, will@...nel.org,
maz@...nel.org, tglx@...utronix.de, robin.murphy@....com,
suzuki.poulose@....com, akpm@...ux-foundation.org, jgg@...pe.ca
Subject: Re: [PATCH v2 1/4] swiotlb: dma: its: Enforce host page-size
alignment for shared buffers
Steven Price <steven.price@....com> writes:
> On 21/12/2025 16:09, Aneesh Kumar K.V (Arm) wrote:
>> When running private-memory guests, the guest kernel must apply
>> additional constraints when allocating buffers that are shared with the
>> hypervisor.
>>
>> These shared buffers are also accessed by the host kernel and therefore
>> must be aligned to the host’s page size.
>>
>> On non-secure hosts, set_guest_memory_attributes() tracks memory at the
>> host PAGE_SIZE granularity. This creates a mismatch when the guest
>> applies attributes at 4K boundaries while the host uses 64K pages. In
>> such cases, the call returns -EINVAL, preventing the conversion of
>> memory regions from private to shared.
>>
>> Architectures such as Arm can tolerate realm physical address space PFNs
>> being mapped as shared memory, as incorrect accesses are detected and
>> reported as GPC faults. However, relying on this mechanism is unsafe and
>> can still lead to kernel crashes.
>>
>> This is particularly likely when guest_memfd allocations are mmapped and
>> accessed from userspace. Once exposed to userspace, we cannot guarantee
>> that applications will only access the intended 4K shared region rather
>> than the full 64K page mapped into their address space. Such userspace
>> addresses may also be passed back into the kernel and accessed via the
>> linear map, resulting in a GPC fault and a kernel crash.
>>
>> With CCA, although Stage-2 mappings managed by the RMM still operate at
>> a 4K granularity, shared pages must nonetheless be aligned to the
>> host-managed page size to avoid the issues described above.
>>
>> Introduce a new helper, mem_encrypt_align(), to allow callers to enforce
>> the required alignment for shared buffers.
>>
>> The architecture-specific implementation of mem_encrypt_align() will be
>> provided in a follow-up patch.
>>
>> Signed-off-by: Aneesh Kumar K.V (Arm) <aneesh.kumar@...nel.org>
>> ---
>> arch/arm64/include/asm/mem_encrypt.h | 6 ++++++
>> arch/arm64/mm/mem_encrypt.c | 6 ++++++
>> drivers/irqchip/irq-gic-v3-its.c | 7 ++++---
>> include/linux/mem_encrypt.h | 7 +++++++
>> kernel/dma/contiguous.c | 10 ++++++++++
>> kernel/dma/direct.c | 6 ++++++
>> kernel/dma/pool.c | 6 ++++--
>> kernel/dma/swiotlb.c | 18 ++++++++++++------
>> 8 files changed, 55 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/mem_encrypt.h b/arch/arm64/include/asm/mem_encrypt.h
>> index d77c10cd5b79..b7ac143b81ce 100644
>> --- a/arch/arm64/include/asm/mem_encrypt.h
>> +++ b/arch/arm64/include/asm/mem_encrypt.h
>> @@ -17,6 +17,12 @@ int set_memory_encrypted(unsigned long addr, int numpages);
>> int set_memory_decrypted(unsigned long addr, int numpages);
>> bool force_dma_unencrypted(struct device *dev);
>>
>> +#define mem_encrypt_align mem_encrypt_align
>> +static inline size_t mem_encrypt_align(size_t size)
>> +{
>> + return size;
>> +}
>> +
>> int realm_register_memory_enc_ops(void);
>>
>> /*
>> diff --git a/arch/arm64/mm/mem_encrypt.c b/arch/arm64/mm/mem_encrypt.c
>> index 645c099fd551..deb364eadd47 100644
>> --- a/arch/arm64/mm/mem_encrypt.c
>> +++ b/arch/arm64/mm/mem_encrypt.c
>> @@ -46,6 +46,12 @@ int set_memory_decrypted(unsigned long addr, int numpages)
>> if (likely(!crypt_ops) || WARN_ON(!PAGE_ALIGNED(addr)))
>> return 0;
>>
>> + if (WARN_ON(!IS_ALIGNED(addr, mem_encrypt_align(PAGE_SIZE))))
>> + return 0;
>> +
>> + if (WARN_ON(!IS_ALIGNED(numpages << PAGE_SHIFT, mem_encrypt_align(PAGE_SIZE))))
>> + return 0;
>> +
>> return crypt_ops->decrypt(addr, numpages);
>> }
>> EXPORT_SYMBOL_GPL(set_memory_decrypted);
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index 467cb78435a9..ffb8ef3a1eb3 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -213,16 +213,17 @@ static gfp_t gfp_flags_quirk;
>> static struct page *its_alloc_pages_node(int node, gfp_t gfp,
>> unsigned int order)
>> {
>> + unsigned int new_order;
>> struct page *page;
>> int ret = 0;
>>
>> - page = alloc_pages_node(node, gfp | gfp_flags_quirk, order);
>> -
>> + new_order = get_order(mem_encrypt_align((PAGE_SIZE << order)));
>> + page = alloc_pages_node(node, gfp | gfp_flags_quirk, new_order);
>> if (!page)
>> return NULL;
>>
>> ret = set_memory_decrypted((unsigned long)page_address(page),
>> - 1 << order);
>> + 1 << new_order);
>> /*
>> * If set_memory_decrypted() fails then we don't know what state the
>> * page is in, so we can't free it. Instead we leak it.
>
> Don't you also need to update its_free_pages() in a similar manner so
> that the set_memory_encrypted()/free_pages() calls are done with the
> same order argument?
>
Yes, agreed, good point. The free path needs to mirror the allocation
path, so its_free_pages() should call set_memory_encrypted() and
free_pages() with the same order that the allocation path used for
set_memory_decrypted(). I'll update it accordingly to keep the two paths
symmetric. I also noticed that swiotlb needs a similar change.
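
For the ITS side, something along these lines is what I had in mind
(untested sketch; it assumes its_free_pages() keeps taking the order the
caller originally requested and recomputes the aligned order the same
way the allocation path now does):

static void its_free_pages(void *addr, unsigned int order)
{
	/* Recompute the order that was actually allocated and decrypted. */
	unsigned int new_order = get_order(mem_encrypt_align(PAGE_SIZE << order));

	/*
	 * If we can't re-encrypt the pages we don't know what state they
	 * are in, so leak them rather than freeing.
	 */
	if (set_memory_encrypted((unsigned long)addr, 1 << new_order))
		return;
	free_pages((unsigned long)addr, new_order);
}

Recomputing new_order inside the free path keeps callers from having to
track the aligned order themselves, mirroring its_alloc_pages_node().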
-aneesh