Message-ID: <0d660898-241b-450f-a3f0-3d09a1c97ae0@arm.com>
Date: Wed, 12 Nov 2025 11:29:48 +0530
From: Dev Jain <dev.jain@....com>
To: Yang Shi <yang@...amperecomputing.com>, Will Deacon <will@...nel.org>
Cc: catalin.marinas@....com, ryan.roberts@....com, rppt@...nel.org,
shijie@...amperecomputing.com, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH] arm64/pageattr: Propagate return value from
__change_memory_common
On 11/11/25 10:07 am, Dev Jain wrote:
>
> On 11/11/25 9:47 am, Yang Shi wrote:
>>
>>
>> On 11/10/25 7:39 PM, Dev Jain wrote:
>>>
>>> On 05/11/25 9:27 am, Dev Jain wrote:
>>>>
>>>> On 04/11/25 6:26 pm, Will Deacon wrote:
>>>>> On Tue, Nov 04, 2025 at 09:06:12AM +0530, Dev Jain wrote:
>>>>>> On 04/11/25 12:15 am, Yang Shi wrote:
>>>>>>> On 11/3/25 7:16 AM, Will Deacon wrote:
>>>>>>>> On Mon, Nov 03, 2025 at 11:43:06AM +0530, Dev Jain wrote:
>>>>>>>>> Post a166563e7ec3 ("arm64: mm: support large block mapping when
>>>>>>>>> rodata=full"), __change_memory_common has a real chance of failing
>>>>>>>>> due to split failure. Before that commit, this line was introduced in
>>>>>>>>> c55191e96caa, still having a chance of failing if it needs to
>>>>>>>>> allocate pagetable memory in apply_to_page_range, although that has
>>>>>>>>> never been observed to be true. In general, we should always
>>>>>>>>> propagate the return value to the caller.
>>>>>>>>>
>>>>>>>>> Cc: stable@...r.kernel.org
>>>>>>>>> Fixes: c55191e96caa ("arm64: mm: apply r/o permissions of VM areas
>>>>>>>>> to its linear alias as well")
>>>>>>>>> Signed-off-by: Dev Jain <dev.jain@....com>
>>>>>>>>> ---
>>>>>>>>> Based on Linux 6.18-rc4.
>>>>>>>>>
>>>>>>>>> arch/arm64/mm/pageattr.c | 5 ++++-
>>>>>>>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>>>>>>>>> index 5135f2d66958..b4ea86cd3a71 100644
>>>>>>>>> --- a/arch/arm64/mm/pageattr.c
>>>>>>>>> +++ b/arch/arm64/mm/pageattr.c
>>>>>>>>> @@ -148,6 +148,7 @@ static int change_memory_common(unsigned long addr, int numpages,
>>>>>>>>>      unsigned long size = PAGE_SIZE * numpages;
>>>>>>>>>      unsigned long end = start + size;
>>>>>>>>>      struct vm_struct *area;
>>>>>>>>> +    int ret;
>>>>>>>>>      int i;
>>>>>>>>>
>>>>>>>>>      if (!PAGE_ALIGNED(addr)) {
>>>>>>>>> @@ -185,8 +186,10 @@ static int change_memory_common(unsigned long addr, int numpages,
>>>>>>>>>      if (rodata_full && (pgprot_val(set_mask) == PTE_RDONLY ||
>>>>>>>>>                          pgprot_val(clear_mask) == PTE_RDONLY)) {
>>>>>>>>>          for (i = 0; i < area->nr_pages; i++) {
>>>>>>>>> -            __change_memory_common((u64)page_address(area->pages[i]),
>>>>>>>>> +            ret = __change_memory_common((u64)page_address(area->pages[i]),
>>>>>>>>>                                           PAGE_SIZE, set_mask, clear_mask);
>>>>>>>>> +            if (ret)
>>>>>>>>> +                return ret;
>>>>>>>> Hmm, this means we can return failure half-way through the operation.
>>>>>>>> Is that something callers are expecting to handle? If so, how can
>>>>>>>> they tell how far we got?
>>>>>>> IIUC the callers don't have to know whether it is half-way or not,
>>>>>>> because the callers will change the permission back (e.g. to RW) for
>>>>>>> the whole range when freeing memory.
>>>>>> Yes, it is the caller's responsibility to set the VM_FLUSH_RESET_PERMS
>>>>>> flag. Upon vfree(), it will change the direct map permissions back to
>>>>>> RW.
>>>>> Ok, but vfree() ends up using update_range_prot() to do that, and if we
>>>>> need to worry about that failing (as per your commit message), then
>>>>> we're in trouble because the calls to set_area_direct_map() are
>>>>> unchecked.
>>>>>
>>>>> In other words, this patch is either not necessary or it is incomplete.
>>>>
>>>> Here is the relevant email, in the discussion between Ryan and Yang:
>>>>
>>>> https://lore.kernel.org/all/fe52a1d8-5211-4962-afc8-c3f9caf64119@os.amperecomputing.com/
>>>>
>>>>
>>>> We had concluded that all callers of set_memory_ro() or set_memory_rox()
>>>> (which require the linear map permissions to be changed back to default
>>>> upon vfree()) will call it for the entire region (vm_struct). So, when
>>>> we do the set_direct_map_invalid_noflush, it is guaranteed that the
>>>> region has already been split, and therefore the call cannot fail.
>>>>
>>>> https://lore.kernel.org/all/f8898c87-8f49-4ef2-86ae-b60bcf67658c@os.amperecomputing.com/
>>>>
>>>>
>>>> This email notes that there is some code doing set_memory_rw() and
>>>> unnecessarily setting the VM_FLUSH_RESET_PERMS flag, but in that case we
>>>> don't care about the set_direct_map_invalid_noflush call failing,
>>>> because the protections are already RW.
>>>>
>>>> We had also observed, though, that all of this is fragile and depends on
>>>> the caller doing the correct thing. The real solution would be to
>>>> somehow get rid of the BBM-style invalidation. Ryan had proposed some
>>>> methods in that email thread.
>>>>
>>>> One solution I had thought of: observe that we are doing overkill by
>>>> setting the linear map to invalid and then back to default for the
>>>> *entire* region. What we can do instead is iterate over the linear map
>>>> alias of the vm_struct *area and change permissions back to RW only for
>>>> the pages which are *not* RW. Those mappings are guaranteed to be split,
>>>> because they were changed from RW to not-RW in the first place.
>>>
>>> @Yang and Ryan,
>>>
>>> I saw Yang's patch here:
>>> https://lore.kernel.org/all/20251023204428.477531-1-yang@os.amperecomputing.com/
>>>
>>> and realized that currently we are splitting away the linear map alias
>>> of the *entire* region.
>>>
>>> Shouldn't this then imply that set_direct_map_invalid_noflush will never
>>> fail, since even a set_memory_rox() call on a single page will split the
>>> linear map for the entire region, and thus there is no fragility here of
>>> the kind we were discussing? I may be forgetting something; this linear
>>> map stuff is confusing enough already.
>>
>> It still may fail due to page table allocation failure when doing the
>> split. But it is still fine. We may run into 3 cases:
>>
>> 1. set_memory_rox succeeds in splitting the whole range, in which case
>> set_direct_map_invalid_noflush() will succeed too.
>> 2. set_memory_rox fails to split, e.g. it changes only part of the
>> range's permissions due to page table allocation failure; then
>> set_direct_map_invalid_noflush() may:
>>    a. successfully change the permissions back to default up to where
>> set_memory_rox failed, since that subrange has been successfully split.
>> This is ok, since the remaining range was never changed to ro by
>> set_memory_rox at all.
>>    b. successfully change the permissions back to default for the whole
>> range (for example, if memory pressure has eased by the time
>> set_direct_map_invalid_noflush() is called). This is definitely fine as
>> well.
>
> Correct. What I meant to imply here is: will your patch break this? If
> set_memory_* is applied on x till y, your patch changes the linear map
> alias only from x till y, whereas set_direct_map_invalid_noflush operates
> on 0 till size - 1, where 0 <= x <= y <= size - 1. So it may encounter
> an -ENOMEM on the [0, x) range while invalidating, and that is *not* okay
> because we must reset [0, x) back to default?
Okay, I realize this is wrong: please read the last bit as "and this is
okay because [0, x) is default already", and therefore your patch is
correct.
Let me send a patch documenting this so we don't get confused in the
future; it is easy to forget all of this.
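A documentation comment along these lines might capture the invariant being discussed (a sketch of possible wording only, not the actual patch):

```c
/*
 * Sketch of the invariant concluded in this thread:
 *
 * vfree() calls set_direct_map_invalid_noflush() /
 * set_direct_map_default_noflush() on the whole area, while
 * set_memory_ro()/set_memory_rox() may have changed (and hence split)
 * only part of its linear alias. A split failure while resetting a
 * subrange that was never made RO is benign: that subrange is still at
 * its default RW permissions, so there is nothing to reset there.
 */
```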
>
>
>>
>> Hopefully I don't miss anything.
>>
>> Thanks,
>> Yang
>>
>>
>>>
>>>
>>>>
>>>>>
>>>>> Will
>>>>
>>
>