lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <45dc2746-153a-482d-954f-11fe1cd8d18e@os.amperecomputing.com>
Date: Wed, 17 Sep 2025 12:15:27 -0700
From: Yang Shi <yang@...amperecomputing.com>
To: Ryan Roberts <ryan.roberts@....com>, Dev Jain <dev.jain@....com>,
 Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
 Andrew Morton <akpm@...ux-foundation.org>,
 David Hildenbrand <david@...hat.com>,
 Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 Ard Biesheuvel <ardb@...nel.org>, scott@...amperecomputing.com, cl@...two.org
Cc: linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
 linux-mm@...ck.org
Subject: Re: [PATCH v7 0/6] arm64: support FEAT_BBM level 2 and large block
 mapping when rodata=full



On 9/17/25 11:58 AM, Ryan Roberts wrote:
> On 17/09/2025 18:21, Yang Shi wrote:
>>
>> On 9/17/25 9:28 AM, Ryan Roberts wrote:
>>> Hi Yang,
>>>
>>> Sorry for the slow reply; I'm just getting back to this...
>>>
>>> On 11/09/2025 23:03, Yang Shi wrote:
>>>> Hi Ryan & Catalin,
>>>>
>>>> Any more concerns about this?
>>> I've been trying to convince myself that your assertion that all users that set
>>> the VM_FLUSH_RESET_PERMS also call set_memory_*() for the entire range that was
>>> returned my vmalloc. I agree that if that is the contract and everyone is
>>> following it, then there is no problem here.
>>>
>>> But I haven't been able to convince myself...
>>>
>>> Some examples (these might intersect with examples you previously raised):
>>>
>>> 1. bpf_dispatcher_change_prog() -> bpf_jit_alloc_exec() -> execmem_alloc() ->
>>> sets VM_FLUSH_RESET_PERMS. But I don't see it calling set_memory_*() for
>>> rw_image.
>> Yes, it doesn't call set_memory_*(). I spotted this in the earlier email. But it
>> is actually RW, so it should be ok to miss the call. The later
>> set_direct_map_invalid call in vfree() may fail, but set_direct_map_default call
>> will set RW permission back. But I think it doesn't have to use execmem_alloc(),
>> the plain vmalloc() should be good enough.
>>
>>> 2. module_memory_alloc() -> execmem_alloc_rw() -> execmem_alloc() -> sets
>>> VM_FLUSH_RESET_PERMS (note that execmem_force_rw() is nop for arm64).
>>> set_memory_*() is not called until much later on in module_set_memory(). Another
>>> error in the meantime could cause the memory to be vfreed before that point.
>> IIUC, execmem_alloc_rw() is used to allocate memory for modules' text section
>> and data section. The code will set mod->mem[type].is_rox according to the type
>> of the section. It is true for text, false for data. Then set_memory_rox() will
>> be called later if it is true *after* insns are copied to the memory. So it is
>> still RW before that point.
>>
>>> 3. When set_vm_flush_reset_perms() is set for the range, it is called before
>>> set_memory_*() which might then fail to split prior to vfree.
>> Yes, all call sites check the return value and bail out if set_memory_*() failed
>> if I don't miss anything.
>>
>>> But I guess as long as set_memory_*() is never successfully called for a
>>> *sub-range* of the vmalloc'ed region, then for all of the above issues, the
>>> memory must still be RW at vfree-time, so this issue should be benign... I think?
>> Yes, it is true.
> So to summarise, all freshly vmalloc'ed memory starts as RW. set_memory_*() may
> only be called if VM_FLUSH_RESET_PERMS has already been set. If set_memory_*()
> is called at all, the first call MUST be for the whole range.

Whether the default permission is RW or not depends on the type passed 
in by execmem_alloc(). It is defined by execmem_info in 
arch/arm64/mm/init.c. For ARM64, module and BPF have PAGE_KERNEL 
permission (RW) by default, but kprobes is PAGE_KERNEL_ROX (ROX).

> If those requirements are all met, then if VM_FLUSH_RESET_PERMS was set but
> set_memory_*() was never called, the worst that can happen is for both the
> set_direct_map_invalid() and set_direct_map_default() calls to fail due to not
> enough memory. But that is safe because the memory was always RW. If
> set_memory_*() was called for the whole range and failed, it's the same as if it
> was never called. If it was called for the whole range and succeeded, then the
> split must have happened already and set_direct_map_invalid() and
> set_direct_map_default() will therefore definitely succeed.
>
> The only way this could be a problem is if someone vmallocs a range then
> performs a set_memory_*() on a sub-region without having first done it for the
> whole region. But we have not found any evidence that there are any users that
> do that.

Yes, exactly.

>
> In fact, by that logic, I think alloc_insn_page() must also be safe; it only
> allocates 1 page, so if set_memory_*() is subsequently called for it, it must by
> definition be covering the whole allocation; 1 page is the smallest amount that
> can be protected.

Yes, but kprobes default permission is ROX.

>
> So I agree we are safe.
>
>
>>> In summary this all looks horribly fragile. But I *think* it works. It would be
>>> good to clean it all up and have some clearly documented rules regardless. But I
>>> think that could be a follow up series.
>> Yeah, absolutely agreed.
>>
>>>> Shall we move forward with v8?
>>> Yes; Do you wnat me to post that or would you prefer to do it? I'm happy to do
>>> it; there are a few other tidy ups in pageattr.c I want to make which I spotted.
>> I actually just had v8 ready in my tree. I removed pageattr_pgd_entry and
>> pageattr_pud_entry in pageattr.c and fixed pmd_leaf/pud_leaf as you suggested.
>> Is it the cleanup you are supposed to do?
> I was also going to fix up the comment in change_memory_common() which is now stale.

Oops, I missed that in my v8. Please just comment for v8, I can fix it 
up later.

Thanks,
Yang


>
>> And I also rebased it on top of
>> Shijie's series (https://git.kernel.org/pub/scm/linux/kernel/git/arm64/
>> linux.git/commit/?id=bfbbb0d3215f) which has been picked up by Will.
>>
>>>> We can include the
>>>> fix to kprobes in v8 or I can send it separately, either is fine to me.
>>> Post it on list, and I'll also incorporate into the series.
>> I can include it in v8 series.
>>
>>>> Hopefully we can make v6.18.
>>> It's probably getting a bit late now. Anyway, I'll aim to get v8 out tomorrow or
>>> Friday and we will see what Will thinks.
>> Thank you. I can post v8 today.
> OK great - I'll leave it all to you then - thanks!
>
>> Thanks,
>> Yang
>>
>>> Thanks,
>>> Ryan
>>>
>>>> Thanks,
>>>> Yang
>>>>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ