[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2f594c55-6fce-445f-b838-ea518916103f@arm.com>
Date: Wed, 2 Apr 2025 13:17:20 +0100
From: Robin Murphy <robin.murphy@....com>
To: HongBo Yao <andy.xu@...micro.com>, will@...nel.org
Cc: mark.rutland@....com, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, allen.wang@...micro.com, peter.du@...micro.com
Subject: Re: [PATCH v2] perf: arm-ni: Fix list_add() corruption in
arm_ni_probe()
On 2025-03-31 1:15 pm, HongBo Yao wrote:
> From: Hongbo Yao <andy.xu@...micro.com>
>
> When a resource allocation fails in one clock domain of an NI device,
> we need to properly roll back all previously registered perf PMUs in
> other clock domains of the same device.
>
> Otherwise, it can lead to kernel panics.
>
> Calling arm_ni_init+0x0/0xff8 [arm_ni] @ 2374
> arm-ni ARMHCB70:00: Failed to request PMU region 0x1f3c13000
> arm-ni ARMHCB70:00: probe with driver arm-ni failed with error -16
> list_add corruption: next->prev should be prev (fffffd01e9698a18),
> but was 0000000000000000. (next=ffff10001a0decc8).
> pstate: 6340009 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
> pc : list_add_valid_or_report+0x7c/0xb8
> lr : list_add_valid_or_report+0x7c/0xb8
> Call trace:
> __list_add_valid_or_report+0x7c/0xb8
> perf_pmu_register+0x22c/0x3a0
> arm_ni_probe+0x554/0x70c [arm_ni]
> platform_probe+0x70/0xe8
> really_probe+0xc6/0x4d8
> driver_probe_device+0x48/0x170
> __driver_attach+0x8e/0x1c0
> bus_for_each_dev+0x64/0xf0
> driver_add+0x138/0x260
> bus_add_driver+0x68/0x138
> __platform_driver_register+0x2c/0x40
> arm_ni_init+0x14/0x2a [arm_ni]
> do_init_module+0x36/0x298
> ---[ end trace 0000000000000000 ]---
> Kernel panic - not syncing: Oops - BUG: Fatal exception
> SMP: stopping secondary CPUs
This, and the subject, are really confusing, since apparent list
corruption is just one of many possible symptoms, and the fact that it
happens to be a second instance of arm_ni_probe() hitting it is pure
coincidence. The bug is that probe failure can lead to PMUs being freed
without being unregistered, which can obviously lead to all manner of
use-after-free conditions in the perf core.
> Fixes: 4d5a7680f2b4 ("perf: Add driver for Arm NI-700 interconnect PMU")
> Signed-off-by: Hongbo Yao <andy.xu@...micro.com>
> ---
> drivers/perf/arm-ni.c | 44 +++++++++++++++++++++++++++++--------------
> 1 file changed, 30 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/perf/arm-ni.c b/drivers/perf/arm-ni.c
> index fd7a5e60e963..ee85577e86b9 100644
> --- a/drivers/perf/arm-ni.c
> +++ b/drivers/perf/arm-ni.c
> @@ -102,6 +102,7 @@ struct arm_ni_unit {
> struct arm_ni_cd {
> void __iomem *pmu_base;
> u16 id;
> + bool pmu_registered;
> int num_units;
> int irq;
> int cpu;
> @@ -571,10 +572,31 @@ static int arm_ni_init_cd(struct arm_ni *ni, struct arm_ni_node *node, u64 res_s
> err = perf_pmu_register(&cd->pmu, name, -1);
> if (err)
> cpuhp_state_remove_instance_nocalls(arm_ni_hp_state, &cd->cpuhp_node);
> + else
> + cd->pmu_registered = true;
>
> return err;
> }
>
> +static void arm_ni_remove_cds(struct arm_ni *ni)
> +{
> + for (int i = 0; i < ni->num_cds; i++) {
> + struct arm_ni_cd *cd = ni->cds + i;
> +
> + if (!cd->pmu_base)
> + continue;
> +
> + if (!cd->pmu_registered)
> + continue;
It's clearly redundant to have both checks here - TBH I'd be inclined to
just do the minimal fix with the existing pmu_base logic:
@@ -656,8 +673,11 @@ static int arm_ni_probe(struct platform_device *pdev)
reg = readl_relaxed(pd.base + NI_CHILD_PTR(c));
arm_ni_probe_domain(base + reg, &cd);
ret = arm_ni_init_cd(ni, &cd, res->start);
- if (ret)
+ if (ret) {
+ ni->cds[cd.id].pmu_base = NULL;
+ arm_ni_remove(pdev);
return ret;
+ }
}
}
}
...however I'm not against replacing that use of pmu_base with an
explicit flag if you think that's nicer. As implied, though, I really
don't see any point in splitting up arm_ni_remove() itself, we can just
move the whole definition up and call it for this purpose too (with your
other drvdata fix).
Thanks,
Robin.
Powered by blists - more mailing lists