Message-ID: <20250520103500.00003905@huawei.com>
Date: Tue, 20 May 2025 10:35:00 +0100
From: Jonathan Cameron <Jonathan.Cameron@...wei.com>
To: Vijay Balakrishna <vijayb@...ux.microsoft.com>
CC: Borislav Petkov <bp@...en8.de>, Tony Luck <tony.luck@...el.com>, "Rob
 Herring" <robh@...nel.org>, Krzysztof Kozlowski <krzk+dt@...nel.org>, "Conor
 Dooley" <conor+dt@...nel.org>, James Morse <james.morse@....com>, "Mauro
 Carvalho Chehab" <mchehab@...nel.org>, Robert Richter <rric@...nel.org>,
	<linux-edac@...r.kernel.org>, <linux-kernel@...r.kernel.org>, Tyler Hicks
	<code@...icks.com>, Marc Zyngier <maz@...nel.org>, Sascha Hauer
	<s.hauer@...gutronix.de>, Lorenzo Pieralisi <lpieralisi@...nel.org>,
	<devicetree@...r.kernel.org>
Subject: Re: [PATCH 1/3] drivers/edac: Add L1 and L2 error detection for A72

On Thu, 15 May 2025 17:06:11 -0700
Vijay Balakrishna <vijayb@...ux.microsoft.com> wrote:

> From: Sascha Hauer <s.hauer@...gutronix.de>
> 
> The Cortex A72 cores have error detection capabilities for
> the L1/L2 Caches, this patch adds a driver for them. The selected errors
> to detect/report are by reading CPU/L2 memory error syndrome registers.
> 
> Unfortunately there is no robust way to inject errors into the caches,
> so this driver doesn't contain any code to actually test it. It has
> been tested though with code taken from an older version [1] of this
> driver.  For reasons stated in thread [1], the error injection code is
> not suitable for mainline, so it is removed from the driver.
> 
> [1] https://lore.kernel.org/all/1521073067-24348-1-git-send-email-york.sun@nxp.com/#t
> 
> Signed-off-by: Sascha Hauer <s.hauer@...gutronix.de>
> Co-developed-by: Vijay Balakrishna <vijayb@...ux.microsoft.com>
> Signed-off-by: Vijay Balakrishna <vijayb@...ux.microsoft.com>
Hi.

There are some issues with the release of device_node references in the
OF parsing code; see inline.

Jonathan

> diff --git a/drivers/edac/edac_a72.c b/drivers/edac/edac_a72.c
> new file mode 100644
> index 000000000000..13acd7e7cef0
> --- /dev/null
> +++ b/drivers/edac/edac_a72.c
> @@ -0,0 +1,233 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Cortex A72 EDAC L1 and L2 cache error detection
> + *
> + * Copyright (c) 2020 Pengutronix, Sascha Hauer <s.hauer@...gutronix.de>
> + *
> + * Based on Code from:
> + * Copyright (c) 2018, NXP Semiconductor
> + * Author: York Sun <york.sun@....com>
> + *
Trivial, but this blank comment line adds nothing useful.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/bitfield.h>
> +#include <asm/smp_plat.h>
> +

> +
> +static void cortex_arm64_edac_remove(struct platform_device *pdev)
> +{
> +	struct edac_device_ctl_info *edac_ctl = dev_get_drvdata(&pdev->dev);
> +
> +	edac_device_del_device(edac_ctl->dev);

Maybe worth thinking about devm_ versions of the functions these
are undoing, though not for this patch set.


> +	edac_device_free_ctl_info(edac_ctl);
> +}
> +
> +static const struct of_device_id cortex_arm64_edac_of_match[] = {
> +	{ .compatible = "arm,cortex-a72" },
> +	{}
> +};
> +MODULE_DEVICE_TABLE(of, cortex_arm64_edac_of_match);
> +
> +static struct platform_driver cortex_arm64_edac_driver = {
> +	.probe = cortex_arm64_edac_probe,
> +	.remove = cortex_arm64_edac_remove,
> +	.driver = {
> +		.name = DRVNAME,
> +	},
> +};
> +
> +static int __init cortex_arm64_edac_driver_init(void)
> +{
> +	struct device_node *np;
> +	int cpu;
> +	struct platform_device *pdev;
> +	int err;

Might as well have

	int err, cpu;

> +
> +	for_each_possible_cpu(cpu) {
> +		np = of_get_cpu_node(cpu, NULL);

		np = of_cpu_device_node_get(cpu);
is probably appropriate here.  See docs for of_get_cpu_node - that is
only meant for initial setup of the device_node to cpu logical
id mapping.  It uses an extra walk in the wrong direction.


> +
> +		if (!np) {
> +			pr_warn("failed to find device node for cpu %d\n", cpu);
> +			continue;
> +		}
> +		if (!of_match_node(cortex_arm64_edac_of_match, np))
> +			continue;

You are still holding a reference to the node here; it should have been
released before the continue.  If Borislav doesn't mind them in edac, use
the __free() magic to handle this:

		struct device_node *np __free(device_node) =
			of_cpu_device_node_get(cpu);

and don't manually release the node at all.  It will be released on scope
exit (so on each iteration of the loop).  This is safe for the !np test as well.
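
For anyone unfamiliar with the scope-exit behaviour, here is a rough
userspace sketch of the compiler cleanup attribute (GCC/Clang) that this
style of annotation builds on.  It is not kernel code: struct fake_node,
fake_node_get(), fake_node_put(), AUTO_PUT_NODE, scan_cpus() and
outstanding_refs() are all made-up stand-ins for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node: a refcount and two flags. */
struct fake_node {
	int refcount;
	bool compatible;
	bool edac_enabled;
};

#define NR_FAKE_CPUS 4

static struct fake_node fake_nodes[NR_FAKE_CPUS] = {
	{ 0, true,  true  },
	{ 0, true,  false },
	{ 0, false, true  },
	{ 0, true,  true  },
};

/* Hypothetical lookup that takes a reference, or returns NULL. */
static struct fake_node *fake_node_get(int cpu)
{
	if (cpu < 0 || cpu >= NR_FAKE_CPUS)
		return NULL;
	fake_nodes[cpu].refcount++;
	return &fake_nodes[cpu];
}

/* Drops a reference; tolerates NULL so the !np path needs no special case. */
static void fake_node_put(struct fake_node *np)
{
	if (np) {
		assert(np->refcount > 0);
		np->refcount--;
	}
}

/* The cleanup handler is passed a pointer to the annotated variable. */
static void fake_node_cleanup(struct fake_node **npp)
{
	fake_node_put(*npp);
}

#define AUTO_PUT_NODE __attribute__((cleanup(fake_node_cleanup)))

/* Returns a bitmask of cpus whose node is compatible and enabled. */
unsigned int scan_cpus(void)
{
	unsigned int mask = 0;

	/* Deliberately runs one past the end to exercise the NULL path. */
	for (int cpu = 0; cpu < NR_FAKE_CPUS + 1; cpu++) {
		/* Released when np goes out of scope, on every iteration. */
		struct fake_node *np AUTO_PUT_NODE = fake_node_get(cpu);

		if (!np)
			continue;
		if (!np->compatible)
			continue;
		if (!np->edac_enabled)
			continue;
		mask |= 1u << cpu;
	}
	return mask;
}

/* Sum of references still held; zero means nothing leaked. */
int outstanding_refs(void)
{
	int total = 0;

	for (int i = 0; i < NR_FAKE_CPUS; i++)
		total += fake_nodes[i].refcount;
	return total;
}
```

Every continue path, and the normal end of the loop body, drops the
reference, so no explicit put call is needed inside the loop.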

> +		if (!of_property_read_bool(np, "edac-enabled"))
> +			continue;
> +		cpumask_set_cpu(cpu, &compat_mask);
> +		of_node_put(np);
> +	}
> +
> +	if (cpumask_empty(&compat_mask))
> +		return 0;
> +
> +	err = platform_driver_register(&cortex_arm64_edac_driver);
> +	if (err)
> +		return err;
> +
> +	pdev = platform_device_register_simple(DRVNAME, -1, NULL, 0);
> +	if (IS_ERR(pdev)) {
> +		pr_err("failed to register cortex arm64 edac device\n");
> +		platform_driver_unregister(&cortex_arm64_edac_driver);
> +		return PTR_ERR(pdev);
> +	}
> +
> +	return 0;
> +}
> +

> +static void __exit cortex_arm64_edac_driver_exit(void)
> +{
> +		platform_driver_unregister(&cortex_arm64_edac_driver);

Looks like a bonus tab.

> +}
> +
> +module_init(cortex_arm64_edac_driver_init);
> +module_exit(cortex_arm64_edac_driver_exit);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Sascha Hauer <s.hauer@...gutronix.de>");
> +MODULE_DESCRIPTION("Cortex A72 L1 and L2 cache EDAC driver");

