linux-kernel - Re: [PATCH v5 2/3] arm64: mmu: avoid allocating pages while splitting the linear mapping

Message-ID: <aXHU02hsrB64Zopo@e129823.arm.com>
Date: Thu, 22 Jan 2026 07:42:11 +0000
From: Yeoreum Yun <yeoreum.yun@....com>
To: Yang Shi <yang@...amperecomputing.com>
Cc: Ryan Roberts <ryan.roberts@....com>, Will Deacon <will@...nel.org>,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
	linux-rt-devel@...ts.linux.dev, catalin.marinas@....com,
	akpm@...ux-oundation.org, david@...nel.org, kevin.brodsky@....com,
	quic_zhenhuah@...cinc.com, dev.jain@....com,
	chaitanyas.prakash@....com, bigeasy@...utronix.de,
	clrkwllms@...nel.org, rostedt@...dmis.org,
	lorenzo.stoakes@...cle.com, ardb@...nel.org, jackmanb@...gle.com,
	vbabka@...e.cz, mhocko@...e.com
Subject: Re: [PATCH v5 2/3] arm64: mmu: avoid allocating pages while
 splitting the linear mapping

On Wed, Jan 21, 2026 at 02:57:28PM -0800, Yang Shi wrote:
>
>
> On 1/21/26 2:20 AM, Ryan Roberts wrote:
> > On 21/01/2026 08:32, Yeoreum Yun wrote:
> > > > > > My concern is that if a secondary CPU can race and cause a split, that is
> > > > > > unsound because we have determined that although the primary CPU supports BBML2,
> > > > > > at least one of the secondary CPUs does not. So splitting a live mapping is unsafe.
> > > > > >
> > > > > > I just had a brief chat with Rutland, and he agrees that this _could_ be a
> > > > > > problem. Basically there is a window between onlining the secondary cpus and
> > > > > > entering the stop_machine() where one of those cpus _could_ end up doing
> > > > > > something that causes us to split the linear map.
> > > > If I remember correctly, split_kernel_leaf_mapping() does call
> > > > system_supports_bbml2_noabort() before doing real split. So we basically
> > > > should fall into two categories:
> > > >
> > > > 1. bbml2_noabort is supported on all cpus. Everything is fine.
> > > > 2. bbml2_noabort is not supported on all cpus. split_kernel_leaf_mapping()
> > > > just returns 0. Kernel doesn't split page table, so there won't be TLB
> > > > conflict issue. But the following page prot update may see unexpected block
> > > > mapping, then a WARN will be raised and it will return -EINVAL. So the
> > > > worst case is the caller will fail (IIRC all the callers of set_memory_*()
> > > > handle the failure), and we can know who is trying to change linear mapping
> > > > before the linear mapping gets finalized. AFAICT I
> > > > haven't seen such WARN yet.
> > Ahh good point! So this isn't quite as terrible as I was thinking.
>
> Yeah.
>
> >
> > > Thanks for the great detail :)
> > > I've missed system_supports_bbml2_noabort() in split_kernel_leaf_mapping().
> > >
> > > > > > I'm not immediately sure how to solve that.
> > > > Do we need some synchronization mechanism? If the linear mapping is not
> > > > finalized yet, split_kernel_leaf_mapping() will spin. For example, something
> > > > like this off the top of my head,
> > > >
> > > > DEFINE_STATIC_KEY_FALSE(linear_mapping_finalized);
> > > >
> > > > Once the linear mapping is finalized, we can call
> > > > static_branch_enable(&linear_mapping_finalized);
> > > >
> > > > In split_kernel_leaf_mapping(), we can just do:
> > > >
> > > > retry:
> > > >      if (!static_branch_likely(&linear_mapping_finalized))
> > > >          goto retry;
> > > >
> > Yuck... But I guess it might work as long as the primary thread never does
> > anything that would cause an attempt to split; otherwise we have a deadlock.
> >
> > > > There may be better way to handle it. But this case should be very unlikely
> > > > IMHO. It sounds crazy to have such complicated kernel threads run so early.
> > > > I'm not sure whether we should pay immediate attention to it or not.
> > I think we need to figure out if this is actually possible. We bring up the
> > secondary cpus, set system caps and finalize the linear map in smp_init().
> > That's called from kernel_init_freeable() which is called from kernel_init(),
> > which is invoked as a thread pinned to the boot cpu.
> >
> > sched_init_smp() is called after smp_init() (i.e. after the linear map is
> > finalized). I'm guessing (based on the name of sched_init_smp()) that nothing
> > other than the idle thread will run on any secondaries until after
> > sched_init_smp() is called? (I'd be grateful if anyone can confirm that).
> >
> > Rutland suggested that it's probably too early for any PM type stuff to be
> > running in the idle loop, so based on all of that, perhaps this is not a problem
> > after all and there is basically zero chance of a secondary cpu doing anything
> > that could cause a linear map split during this window?
> >
> > I'm inclined to leave this as is for now.
>
> I agree. I don't think this would be a real problem.
>
> Thanks,
> Yang

Although the partial use of GFP_ATOMIC might not be an issue, given that
there is no contention at the moment, calling a memory allocation API
inside stop_machine() is technically problematic for PREEMPT_RT, so the
relevant page tables should be pre-allocated.

That said, taking a step back (I’m not sure why I was being so stubborn about this),
since the kernel_alias area is mapped using block mappings,
a simple calculation based on your dm_meminfo patch should be sufficient to
determine the number of page tables that need to be pre-allocated for
splitting the linear mapping, without having to walk the page tables again.
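
To illustrate the kind of calculation I mean, here is a rough userspace
sketch. It assumes a 4K granule (512 entries per table) and that every
block mapping is split all the way down to PTE level. The struct and
function names are hypothetical; they are not taken from the dm_meminfo
patch, and the real counters may look different.

```c
/*
 * Hypothetical summary of the linear map. These field names are
 * made up for illustration and are NOT the dm_meminfo layout.
 */
struct linear_map_counts {
	unsigned long pud_blocks;	/* number of PUD-level (1 GiB) block mappings */
	unsigned long pmd_blocks;	/* number of PMD-level (2 MiB) block mappings */
};

/* Assumption: 4K granule, so each table holds 512 entries. */
#define ENTRIES_PER_TABLE	512UL

/*
 * Worst-case number of page-table pages needed to split every block
 * mapping down to PTE level:
 *   - each PUD block needs 1 PMD table plus 512 PTE tables
 *   - each PMD block needs 1 PTE table
 */
static unsigned long tables_needed_for_split(const struct linear_map_counts *c)
{
	unsigned long pmd_tables = c->pud_blocks;
	unsigned long pte_tables = c->pud_blocks * ENTRIES_PER_TABLE +
				   c->pmd_blocks;

	return pmd_tables + pte_tables;
}
```

For example, with 2 PUD blocks and 100 PMD blocks this gives
2 + 2 * 512 + 100 = 1126 pages to pre-allocate before stop_machine().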

So, after your dm_meminfo patch, I plan to respin this patch based on that.

Am I missing anything?

--
Sincerely,
Yeoreum Yun