Message-ID: <20250520103500.00003905@huawei.com>
Date: Tue, 20 May 2025 10:35:00 +0100
From: Jonathan Cameron <Jonathan.Cameron@...wei.com>
To: Vijay Balakrishna <vijayb@...ux.microsoft.com>
CC: Borislav Petkov <bp@...en8.de>, Tony Luck <tony.luck@...el.com>, "Rob
Herring" <robh@...nel.org>, Krzysztof Kozlowski <krzk+dt@...nel.org>, "Conor
Dooley" <conor+dt@...nel.org>, James Morse <james.morse@....com>, "Mauro
Carvalho Chehab" <mchehab@...nel.org>, Robert Richter <rric@...nel.org>,
<linux-edac@...r.kernel.org>, <linux-kernel@...r.kernel.org>, Tyler Hicks
<code@...icks.com>, Marc Zyngier <maz@...nel.org>, Sascha Hauer
<s.hauer@...gutronix.de>, Lorenzo Pieralisi <lpieralisi@...nel.org>,
<devicetree@...r.kernel.org>
Subject: Re: [PATCH 1/3] drivers/edac: Add L1 and L2 error detection for A72
On Thu, 15 May 2025 17:06:11 -0700
Vijay Balakrishna <vijayb@...ux.microsoft.com> wrote:
> From: Sascha Hauer <s.hauer@...gutronix.de>
>
> The Cortex A72 cores have error detection capabilities for
> the L1/L2 Caches, this patch adds a driver for them. The selected errors
> to detect/report are by reading CPU/L2 memory error syndrome registers.
>
> Unfortunately there is no robust way to inject errors into the caches,
> so this driver doesn't contain any code to actually test it. It has
> been tested though with code taken from an older version [1] of this
> driver. For reasons stated in thread [1], the error injection code is
> not suitable for mainline, so it is removed from the driver.
>
> [1] https://lore.kernel.org/all/1521073067-24348-1-git-send-email-york.sun@nxp.com/#t
>
> Signed-off-by: Sascha Hauer <s.hauer@...gutronix.de>
> Co-developed-by: Vijay Balakrishna <vijayb@...ux.microsoft.com>
> Signed-off-by: Vijay Balakrishna <vijayb@...ux.microsoft.com>
Hi.
There are some issues with the release of device_nodes in the OF parsing code.
Jonathan
> diff --git a/drivers/edac/edac_a72.c b/drivers/edac/edac_a72.c
> new file mode 100644
> index 000000000000..13acd7e7cef0
> --- /dev/null
> +++ b/drivers/edac/edac_a72.c
> @@ -0,0 +1,233 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Cortex A72 EDAC L1 and L2 cache error detection
> + *
> + * Copyright (c) 2020 Pengutronix, Sascha Hauer <s.hauer@...gutronix.de>
> + *
> + * Based on Code from:
> + * Copyright (c) 2018, NXP Semiconductor
> + * Author: York Sun <york.sun@....com>
> + *
Trivial, but this blank line adds nothing useful.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/bitfield.h>
> +#include <asm/smp_plat.h>
> +
> +
> +static void cortex_arm64_edac_remove(struct platform_device *pdev)
> +{
> + struct edac_device_ctl_info *edac_ctl = dev_get_drvdata(&pdev->dev);
> +
> + edac_device_del_device(edac_ctl->dev);
Maybe worth thinking about devm_ versions of the functions these
are undoing, though not for this patch set.
> + edac_device_free_ctl_info(edac_ctl);
> +}
> +
> +static const struct of_device_id cortex_arm64_edac_of_match[] = {
> + { .compatible = "arm,cortex-a72" },
> + {}
> +};
> +MODULE_DEVICE_TABLE(of, cortex_arm64_edac_of_match);
> +
> +static struct platform_driver cortex_arm64_edac_driver = {
> + .probe = cortex_arm64_edac_probe,
> + .remove = cortex_arm64_edac_remove,
> + .driver = {
> + .name = DRVNAME,
> + },
> +};
> +
> +static int __init cortex_arm64_edac_driver_init(void)
> +{
> + struct device_node *np;
> + int cpu;
> + struct platform_device *pdev;
> + int err;
Might as well have
int err, cpu;
> +
> + for_each_possible_cpu(cpu) {
> + np = of_get_cpu_node(cpu, NULL);
np = of_cpu_device_node_get(cpu);
is probably appropriate here. See the docs for of_get_cpu_node(): that is
only meant for the initial setup of the device_node to logical CPU
id mapping. It uses an extra walk in the wrong direction.
> +
> + if (!np) {
> + pr_warn("failed to find device node for cpu %d\n", cpu);
> + continue;
> + }
> + if (!of_match_node(cortex_arm64_edac_of_match, np))
> + continue;
At this continue you are still holding the reference to the node, which
should have been released. The same applies to the edac-enabled check below.
If Borislav doesn't mind them in edac, use __free magic to handle this.
struct device_node *np __free(device_node) =
of_cpu_device_node_get(cpu);
and don't manually release the node at all. It will be released on scope
exit (so on each iteration of the loop). This is safe for the !np test as well.
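For anyone unfamiliar with the pattern, here is a minimal userspace sketch of
what scope-based release buys you in a loop with several early continues. It
is not kernel code: struct node, node_get(), node_put(), auto_node_put and
count_enabled() are all made-up names for illustration. The only thing it
shares with the kernel's __free() helpers is the underlying GCC/Clang
cleanup variable attribute.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node: a refcount and two flags. */
struct node {
	int refcount;
	bool compatible;	/* stands in for the of_match_node() result */
	bool edac_enabled;	/* stands in for the "edac-enabled" property */
};

/* Take a reference; NULL is passed through, as when a lookup fails. */
static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Drop a reference; NULL is tolerated so the !np path stays safe. */
static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/* The cleanup attribute passes a pointer to the annotated variable. */
static void node_put_cleanup(struct node **np)
{
	node_put(*np);
}

/* Illustrative macro name, playing the role of __free(device_node). */
#define auto_node_put __attribute__((cleanup(node_put_cleanup)))

/* Count nodes that are both compatible and enabled, leaking no reference. */
static int count_enabled(struct node **nodes, size_t n)
{
	int found = 0;

	for (size_t i = 0; i < n; i++) {
		/* Released automatically whenever this iteration's scope ends. */
		struct node *np auto_node_put = node_get(nodes[i]);

		if (!np)
			continue;
		if (!np->compatible)
			continue;
		if (!np->edac_enabled)
			continue;
		found++;
	}
	return found;
}
```

Every continue above drops the reference without an explicit node_put() call,
which is the property the suggested __free(device_node) declaration would give
the loop in the patch.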
> + if (!of_property_read_bool(np, "edac-enabled"))
> + continue;
> + cpumask_set_cpu(cpu, &compat_mask);
> + of_node_put(np);
> + }
> +
> + if (cpumask_empty(&compat_mask))
> + return 0;
> +
> + err = platform_driver_register(&cortex_arm64_edac_driver);
> + if (err)
> + return err;
> +
> + pdev = platform_device_register_simple(DRVNAME, -1, NULL, 0);
> + if (IS_ERR(pdev)) {
> + pr_err("failed to register cortex arm64 edac device\n");
> + platform_driver_unregister(&cortex_arm64_edac_driver);
> + return PTR_ERR(pdev);
> + }
> +
> + return 0;
> +}
> +
> +static void __exit cortex_arm64_edac_driver_exit(void)
> +{
> + platform_driver_unregister(&cortex_arm64_edac_driver);
Looks like a bonus tab.
> +}
> +
> +module_init(cortex_arm64_edac_driver_init);
> +module_exit(cortex_arm64_edac_driver_exit);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Sascha Hauer <s.hauer@...gutronix.de>");
> +MODULE_DESCRIPTION("Cortex A72 L1 and L2 cache EDAC driver");