Message-ID: <2025092909-litter-cornstalk-2178@gregkh>
Date: Mon, 29 Sep 2025 20:29:06 +0200
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: Wen Yang <wen.yang@...ux.dev>
Cc: linux-kernel@...r.kernel.org, Pierre Gondois <pierre.gondois@....com>,
Sudeep Holla <sudeep.holla@....com>,
Palmer Dabbelt <palmer@...osinc.com>, stable@...r.kernel.org
Subject: Re: [PATCH 6.1] arch_topology: Build cacheinfo from primary CPU

On Tue, Sep 30, 2025 at 01:57:40AM +0800, Wen Yang wrote:
>
>
> On 9/29/25 21:21, Greg Kroah-Hartman wrote:
> > On Sat, Sep 27, 2025 at 01:46:58AM +0800, Wen Yang wrote:
> > > From: Pierre Gondois <pierre.gondois@....com>
> > >
> > > commit 5944ce092b97caed5d86d961e963b883b5c44ee2 upstream.
> > >
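> > > > commit 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection
> > > > in the CPU hotplug path")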
> > > adds a call to detect_cache_attributes() to populate the cacheinfo
> > > before updating the siblings mask. detect_cache_attributes() allocates
> > > memory and can take the PPTT mutex (on ACPI platforms). On PREEMPT_RT
> > > kernels, on secondary CPUs, this triggers a:
> > > 'BUG: sleeping function called from invalid context' [1]
> > > as the code is executed with preemption and interrupts disabled.
> > >
> > > The primary CPU was previously storing the cache information using
> > > the now removed (struct cpu_topology).llc_id:
> > > commit 5b8dc787ce4a ("arch_topology: Drop LLC identifier stash from
> > > the CPU topology")
> > >
> > > allocate_cache_info() tries to build the cacheinfo from the primary
> > > > CPU prior to secondary CPUs booting, if the DT/ACPI description
> > > > contains cache information.
> > > > If allocate_cache_info() fails, then fall back to the current state
> > > > for the cacheinfo allocation. [1] will be triggered in such a case.
> > >
> > > When unplugging a CPU, the cacheinfo memory cannot be freed. If it
> > > was, then the memory would be allocated early by the re-plugged
> > > CPU and would trigger [1].
> > >
> > > Note that populate_cache_leaves() might be called multiple times
> > > due to populate_leaves being moved up. This is required since
> > > detect_cache_attributes() might be called with per_cpu_cacheinfo(cpu)
> > > being allocated but not populated.
> > >
> > > [1]:
> > > | BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
> > > | in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
> > > | preempt_count: 1, expected: 0
> > > | RCU nest depth: 1, expected: 1
> > > | 3 locks held by swapper/111/0:
> > > | #0: (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
> > > | #1: (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
> > > | #2: (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
> > > | irq event stamp: 0
> > > | hardirqs last enabled at (0): 0x0
> > > | hardirqs last disabled at (0): copy_process+0x5dc/0x1ab8
> > > | softirqs last enabled at (0): copy_process+0x5dc/0x1ab8
> > > | softirqs last disabled at (0): 0x0
> > > | Preemption disabled at:
> > > | migrate_enable+0x30/0x130
> > > | CPU: 111 PID: 0 Comm: swapper/111 Tainted: G W 6.0.0-rc4-rt6-[...]
> > > | Call trace:
> > > | __kmalloc+0xbc/0x1e8
> > > | detect_cache_attributes+0x2d4/0x5f0
> > > | update_siblings_masks+0x30/0x368
> > > | store_cpu_topology+0x78/0xb8
> > > | secondary_start_kernel+0xd0/0x198
> > > | __secondary_switched+0xb0/0xb4
> > >
> > > Signed-off-by: Pierre Gondois <pierre.gondois@....com>
> > > Reviewed-by: Sudeep Holla <sudeep.holla@....com>
> > > Acked-by: Palmer Dabbelt <palmer@...osinc.com>
> > > Link: https://lore.kernel.org/r/20230104183033.755668-7-pierre.gondois@arm.com
> > > Signed-off-by: Sudeep Holla <sudeep.holla@....com>
> > > Cc: <stable@...r.kernel.org> # 6.1.x: c3719bd:cacheinfo: Use RISC-V's init_cache_level() as generic OF implementation
> > > > Cc: <stable@...r.kernel.org> # 6.1.x: 8844c3d:cacheinfo: Return error code in init_of_cache_level()
> > > Cc: <stable@...r.kernel.org> # 6.1.x: de0df44:cacheinfo: Check 'cache-unified' property to count cache leaves
> > > Cc: <stable@...r.kernel.org> # 6.1.x: fa4d566:ACPI: PPTT: Remove acpi_find_cache_levels()
> > > > Cc: <stable@...r.kernel.org> # 6.1.x: bd50036:ACPI: PPTT: Update acpi_find_last_cache_level() to acpi_get_cache_info()
> > > Cc: <stable@...r.kernel.org> # 6.1.x
> >
> > I do not understand, why do you want all of these applied as well? Can
> > you just send the full series of commits?
> >
> Thanks for your comments, here is the original series:
> https://lore.kernel.org/all/167404285593.885445.6219705651301997538.b4-ty@arm.com/
>
> commit 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection in the
> CPU hotplug path") introduced a bug, and this series fixed it.
>
> > > Signed-off-by: Wen Yang <wen.yang@...ux.dev>
> >
> > Also, you have changed this commit a lot from the original one, please
> > document what you did here.
> >
> Thanks for the reminder. We just hope to cherry-pick them onto the 6.1
> stable branch, without modifying the original commit.
> Also checked again, as follows:
>
> $ git cherry-pick c3719bd
> $ git cherry-pick 8844c3d
> $ git cherry-pick de0df44
> $ git cherry-pick fa4d566
> $ git cherry-pick bd50036
> $ git cherry-pick 5944ce0
>
> $ git format-patch HEAD -1
>
> $ diff 0001-arch_topology-Build-cacheinfo-from-primary-CPU.patch
> 20250927_wen_yang_arch_topology_build_cacheinfo_from_primary_cpu.mbx
Can you resend these all as a patch series with your signed-off-by on
them to show that you have tested them?

And again, the commit here did not seem to match up with the original
upstream version, but maybe my tools got it wrong. Resend the series
and I'll check it again.

thanks,

greg k-h
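
Editorial sketch, not part of the original message: one way the resend
requested above could be prepared with stock git. This is a minimal,
self-contained illustration under stated assumptions. It builds a
throwaway repository instead of a kernel tree, and the branch name
"stable-demo", the identity "Example Backporter" and the two example
commit subjects are all hypothetical. Note that "git cherry-pick -x"
records the origin as "(cherry picked from commit ...)", whereas the
quoted patch uses a "commit ... upstream." line, so the commit message
may still need adjusting by hand.

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for a kernel tree (hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Backporter"
git config user.email "backporter@example.org"

# Common ancestor of the "mainline" history and the stable branch.
echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch stable-demo

# Two stand-in "upstream" commits: a prerequisite, then the fix.
echo prerequisite >> file.txt
git commit -q -am "cacheinfo: example prerequisite"
echo fix >> file.txt
git commit -q -am "arch_topology: example fix"
picks=$(git rev-list --reverse stable-demo..HEAD)

# Backport in upstream order: -x records the original commit ID,
# -s appends the backporter's own Signed-off-by to every commit.
git checkout -q stable-demo
git cherry-pick -x -s $picks >/dev/null

# Emit the backports as one numbered series with a cover letter.
out="$repo/outgoing"
git format-patch -q --cover-letter --subject-prefix="PATCH 6.1" -o "$out" -2
ls "$out"
```

On a real tree, the six commit IDs listed in the quoted cherry-pick
commands would take the place of $picks, and the count passed to
format-patch would be -6 rather than -2.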