Message-ID: <c1a9284b-972a-4474-9151-0a2ed8558b9e@arm.com>
Date: Mon, 28 Jul 2025 16:12:08 +0530
From: Dev Jain <dev.jain@....com>
To: catalin.marinas@....com, will@...nel.org
Cc: anshuman.khandual@....com, quic_zhenhuah@...cinc.com,
ryan.roberts@....com, kevin.brodsky@....com, yangyicong@...ilicon.com,
joey.gouly@....com, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, mark.rutland@....com, maz@...nel.org,
stable@...r.kernel.org
Subject: Re: [PATCH] arm64/mm: Fix use-after-free due to race between memory
hotunplug and ptdump
On 28/07/25 4:01 pm, Dev Jain wrote:
> Memory hotunplug is done under the hotplug lock and ptdump walk is done
> under the init_mm.mmap_lock. Therefore, ptdump and hotunplug can run
> simultaneously without any synchronization. During hotunplug,
> free_empty_tables() is ultimately called to free up the pagetables.
> The following race can happen, where x denotes the level of the pagetable:
>
> CPU1                                    CPU2
> free_empty_pxd_table
>                                         ptdump_walk_pgd()
>                                         Get p(x+1)d table from pxd entry
> pxd_clear
> free_hotplug_pgtable_page(p(x+1)dp)
>                                         Still using the p(x+1)d table
>
> which leads to a use-after-free.
>
> To solve this, we need to synchronize ptdump_walk_pgd() with
> free_hotplug_pgtable_page() in such a way that ptdump never takes a
> reference on a freed pagetable.
>
> Since this race is very unlikely to happen in practice, we do not want to
> penalize other code paths taking the init_mm mmap_lock. Therefore, we use
> static keys. ptdump will enable the static key - upon observing that,
> the free_empty_pxd_table() functions will get patched in with an
> mmap_read_lock/unlock sequence. A code comment explains in detail how
> a combination of the acquire semantics of static_branch_enable() and the
> barriers in __flush_tlb_kernel_pgtable() ensures that ptdump will never
> get hold of the address of a freed pagetable: either ptdump will block
> the table freeing path by write-locking the mmap_lock, or ptdump will
> observe the cleared pxd entry and therefore have no access to the
> isolated p(x+1)d pagetable.
>
> This bug was found by code inspection, as a result of working on [1].
>
> [1] https://lore.kernel.org/all/20250723161827.15802-1-dev.jain@arm.com/
>
> Cc: <stable@...r.kernel.org>
> Fixes: bbd6ec605c0f ("arm64/mm: Enable memory hot remove")
> Signed-off-by: Dev Jain <dev.jain@....com>
> ---
Immediately after posting, I expect the first objection will be: why not
just wrap free_empty_tables() in mmap_read_lock/unlock? Memory offlining
should not be a hot path, so taking the read lock there should be fine,
I think.