[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <40682e84-edc1-4861-b078-a5067259cd73@arm.com>
Date: Thu, 13 Nov 2025 12:25:56 +0000
From: Ryan Roberts <ryan.roberts@....com>
To: Dev Jain <dev.jain@....com>, catalin.marinas@....com, will@...nel.org
Cc: rppt@...nel.org, shijie@...amperecomputing.com,
yang@...amperecomputing.com, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] arm64/mm: Document why linear map split failure upon
vm_reset_perms is not problematic
On 12/11/2025 06:27, Dev Jain wrote:
> Consider the following code path:
>
> (1) vmalloc -> (2) set_vm_flush_reset_perms -> (3) set_memory_ro/set_memory_rox
> -> .... (4) use the mapping .... -> (5) vfree -> (6) vm_reset_perms
> -> (7) set_area_direct_map.
> Or, it may happen that we encounter failure at (3) and directly jump to (5).
>
> In both cases, (7) may fail due to linear map split failure. But, we care
> about its success *only* for the region which got successfully changed by
> (3). Such a region is guaranteed to be pte-mapped.
I keep tripping over this bit because I keep forgetting that even if the vmalloc
region whose permissions are being changed contains a huge mapping, we still
change the linear map page-by-page. So the portion of the linear map is
guarranteed to be pte-mapped.
>
> The TLDR is that (7) will surely succeed for the regions we care about.
Appologies, we have definitely discussed this before, but I still can't quite
convince myself. Consider this:
static void vm_reset_perms(struct vm_struct *area)
{
...
/*
* Set direct map to something invalid so that it won't be cached if
* there are any accesses after the TLB flush, then flush the TLB and
* reset the direct map permissions to the default.
*/
set_area_direct_map(area, set_direct_map_invalid_noflush);
_vm_unmap_aliases(start, end, flush_dmap);
set_area_direct_map(area, set_direct_map_default_noflush);
}
If we have a situation where a region of 4M is allocated with vmalloc, which is
PMD-mapped, then only the second 2M has its permissions changed by
set_memory_ro[x], we end up with the first 2M not pte-mapped in the linear map
with default permissions and the second 2M pte-mapped in the linear with
non-default permissions.
The above code tries to set the whole vm area to invalid in the linear map
before issuing a TLB flush. This could fail for the first half of the area if we
are unable to allocate the PTE table to split to PTE (since
set_direct_map_default_noflush() is called page-by-page). So we end up with the
first half of the region valid with default permissions in the linear map and
the second half invalid when we do the TLB flush.
(The above code does it this way to a) reduce the number of TLB flushes to the
minimum - the same one covers both the linear map and the vmalloc invalidation,
and b) to ensure there is no window when memory has both X and W aliases.
Given it is guarranteed that if any of the linear map is still valid, then it's
permissions are the default ones and the vmalloc alias was never changed to
non-default permissions for that part, then I agree this is safe.
I've convinced myself that this is all safe and correct. Sorry for the babble:
Reviewed-by: Ryan Roberts <ryan.roberts@....com>
>
> Signed-off-by: Dev Jain <dev.jain@....com>
> ---
> arch/arm64/mm/pageattr.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
> index b4ea86cd3a71..dc05f06a47f2 100644
> --- a/arch/arm64/mm/pageattr.c
> +++ b/arch/arm64/mm/pageattr.c
> @@ -185,6 +185,15 @@ static int change_memory_common(unsigned long addr, int numpages,
> */
> if (rodata_full && (pgprot_val(set_mask) == PTE_RDONLY ||
> pgprot_val(clear_mask) == PTE_RDONLY)) {
> + /*
> + * Note: One may wonder what happens if the calls to
> + * set_area_direct_map() in vm_reset_perms() fail due ENOMEM on
> + * linear map split failure. Observe that we care about those
> + * calls to succeed *only* for the region whose permissions
> + * are not default. Such a region is guaranteed to be
> + * pte-mapped, because the below call can change those
> + * permissions to non-default only after splitting that region.
> + */
> for (i = 0; i < area->nr_pages; i++) {
> ret = __change_memory_common((u64)page_address(area->pages[i]),
> PAGE_SIZE, set_mask, clear_mask);
Powered by blists - more mailing lists