Message-ID: <yq5apl86tteg.fsf@kernel.org>
Date: Mon, 22 Dec 2025 21:12:31 +0530
From: Aneesh Kumar K.V <aneesh.kumar@...nel.org>
To: Steven Price <steven.price@....com>, linux-kernel@...r.kernel.org,
iommu@...ts.linux.dev, linux-coco@...ts.linux.dev
Cc: Catalin Marinas <catalin.marinas@....com>, will@...nel.org,
maz@...nel.org, tglx@...utronix.de, robin.murphy@....com,
suzuki.poulose@....com, akpm@...ux-foundation.org, jgg@...pe.ca
Subject: Re: [PATCH v2 1/4] swiotlb: dma: its: Enforce host page-size
alignment for shared buffers
Steven Price <steven.price@....com> writes:
> On 21/12/2025 16:09, Aneesh Kumar K.V (Arm) wrote:
>> When running private-memory guests, the guest kernel must apply
>> additional constraints when allocating buffers that are shared with the
>> hypervisor.
>>
>> These shared buffers are also accessed by the host kernel and therefore
>> must be aligned to the host’s page size.
>>
>> On non-secure hosts, set_guest_memory_attributes() tracks memory at the
>> host PAGE_SIZE granularity. This creates a mismatch when the guest
>> applies attributes at 4K boundaries while the host uses 64K pages. In
>> such cases, the call returns -EINVAL, preventing the conversion of
>> memory regions from private to shared.
>>
>> Architectures such as Arm can tolerate realm physical address space PFNs
>> being mapped as shared memory, as incorrect accesses are detected and
>> reported as GPC faults. However, relying on this mechanism is unsafe and
>> can still lead to kernel crashes.
>>
>> This is particularly likely when guest_memfd allocations are mmapped and
>> accessed from userspace. Once exposed to userspace, we cannot guarantee
>> that applications will only access the intended 4K shared region rather
>> than the full 64K page mapped into their address space. Such userspace
>> addresses may also be passed back into the kernel and accessed via the
>> linear map, resulting in a GPC fault and a kernel crash.
>>
>> With CCA, although Stage-2 mappings managed by the RMM still operate at
>> a 4K granularity, shared pages must nonetheless be aligned to the
>> host-managed page size to avoid the issues described above.
>>
>> Introduce a new helper, mem_encrypt_align(), to allow callers to enforce
>> the required alignment for shared buffers.
>>
>> The architecture-specific implementation of mem_encrypt_align() will be
>> provided in a follow-up patch.
>>
>> Signed-off-by: Aneesh Kumar K.V (Arm) <aneesh.kumar@...nel.org>
>> ---
>> arch/arm64/include/asm/mem_encrypt.h | 6 ++++++
>> arch/arm64/mm/mem_encrypt.c | 6 ++++++
>> drivers/irqchip/irq-gic-v3-its.c | 7 ++++---
>> include/linux/mem_encrypt.h | 7 +++++++
>> kernel/dma/contiguous.c | 10 ++++++++++
>> kernel/dma/direct.c | 6 ++++++
>> kernel/dma/pool.c | 6 ++++--
>> kernel/dma/swiotlb.c | 18 ++++++++++++------
>> 8 files changed, 55 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/mem_encrypt.h b/arch/arm64/include/asm/mem_encrypt.h
>> index d77c10cd5b79..b7ac143b81ce 100644
>> --- a/arch/arm64/include/asm/mem_encrypt.h
>> +++ b/arch/arm64/include/asm/mem_encrypt.h
>> @@ -17,6 +17,12 @@ int set_memory_encrypted(unsigned long addr, int numpages);
>> int set_memory_decrypted(unsigned long addr, int numpages);
>> bool force_dma_unencrypted(struct device *dev);
>>
>> +#define mem_encrypt_align mem_encrypt_align
>> +static inline size_t mem_encrypt_align(size_t size)
>> +{
>> + return size;
>> +}
>> +
>> int realm_register_memory_enc_ops(void);
>>
>> /*
>> diff --git a/arch/arm64/mm/mem_encrypt.c b/arch/arm64/mm/mem_encrypt.c
>> index 645c099fd551..deb364eadd47 100644
>> --- a/arch/arm64/mm/mem_encrypt.c
>> +++ b/arch/arm64/mm/mem_encrypt.c
>> @@ -46,6 +46,12 @@ int set_memory_decrypted(unsigned long addr, int numpages)
>> if (likely(!crypt_ops) || WARN_ON(!PAGE_ALIGNED(addr)))
>> return 0;
>>
>> + if (WARN_ON(!IS_ALIGNED(addr, mem_encrypt_align(PAGE_SIZE))))
>> + return 0;
>> +
>> + if (WARN_ON(!IS_ALIGNED(numpages << PAGE_SHIFT, mem_encrypt_align(PAGE_SIZE))))
>> + return 0;
>> +
>> return crypt_ops->decrypt(addr, numpages);
>> }
>> EXPORT_SYMBOL_GPL(set_memory_decrypted);
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index 467cb78435a9..ffb8ef3a1eb3 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -213,16 +213,17 @@ static gfp_t gfp_flags_quirk;
>> static struct page *its_alloc_pages_node(int node, gfp_t gfp,
>> unsigned int order)
>> {
>> + unsigned int new_order;
>> struct page *page;
>> int ret = 0;
>>
>> - page = alloc_pages_node(node, gfp | gfp_flags_quirk, order);
>> -
>> + new_order = get_order(mem_encrypt_align((PAGE_SIZE << order)));
>> + page = alloc_pages_node(node, gfp | gfp_flags_quirk, new_order);
>> if (!page)
>> return NULL;
>>
>> ret = set_memory_decrypted((unsigned long)page_address(page),
>> - 1 << order);
>> + 1 << new_order);
>> /*
>> * If set_memory_decrypted() fails then we don't know what state the
>> * page is in, so we can't free it. Instead we leak it.
>
> Don't you also need to update its_free_pages() in a similar manner so
> that the set_memory_encrypted()/free_pages() calls are done with the
> same order argument?
>
Yes, agreed, good point. The free path needs to mirror the allocation
path, so its_free_pages() should call set_memory_encrypted() and
free_pages() with the same order that the allocation path used for
set_memory_decrypted(). I'll update it accordingly to keep the two paths
symmetric. I also noticed that swiotlb needs a similar change.
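
For the ITS side, something along these lines is what I had in mind
(untested sketch; it assumes its_free_pages() keeps taking the order the
caller originally requested and recomputes the aligned order the same
way the allocation path now does):

static void its_free_pages(void *addr, unsigned int order)
{
	/* Recompute the order that was actually allocated and decrypted. */
	unsigned int new_order = get_order(mem_encrypt_align(PAGE_SIZE << order));

	/*
	 * If we can't re-encrypt the pages we don't know what state they
	 * are in, so leak them rather than freeing.
	 */
	if (set_memory_encrypted((unsigned long)addr, 1 << new_order))
		return;
	free_pages((unsigned long)addr, new_order);
}

Recomputing new_order inside the free path keeps callers from having to
track the aligned order themselves, mirroring its_alloc_pages_node().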
-aneesh