Message-ID: <CY8PR11MB71345FDE3DF74BAF97B563F08973A@CY8PR11MB7134.namprd11.prod.outlook.com>
Date: Tue, 17 Jun 2025 14:09:42 +0000
From: "Zhuo, Qiuxu" <qiuxu.zhuo@...el.com>
To: Borislav Petkov <bp@...en8.de>, "Luck, Tony" <tony.luck@...el.com>
CC: Marek Marczykowski-Górecki
<marmarek@...isiblethingslab.com>, "open list:EDAC-IGEN6"
<linux-edac@...r.kernel.org>, open list <linux-kernel@...r.kernel.org>
Subject: RE: NULL pointer dereference in igen6_probe - 6.16-rc2
Hi Boris,
> From: Borislav Petkov <bp@...en8.de>
> [...]
> > [ 13.565035] EDAC MC0: Giving out device to module igen6_edac controller Intel_client_SoC MC#0: DEV 0000:00:00.0 (INTERRUPT)
> > [ 13.565746] EDAC igen6: Expected 2 mcs, but only 1 detected.
>
> Well, folks, if you've detected only one memory controller, then work with
> only one and do not kill the machine:
>
Yes.
> diff --git a/drivers/edac/igen6_edac.c b/drivers/edac/igen6_edac.c
> index 1930dc00c791..23e26ba2d49b 100644
> --- a/drivers/edac/igen6_edac.c
> +++ b/drivers/edac/igen6_edac.c
> @@ -1350,9 +1350,11 @@ static int igen6_register_mcis(struct pci_dev *pdev, u64 mchbar)
>  		return -ENODEV;
>  	}
>
> -	if (lmc < res_cfg->num_imc)
> +	if (lmc < res_cfg->num_imc) {
>  		igen6_printk(KERN_WARNING, "Expected %d mcs, but only %d detected.",
>  			     res_cfg->num_imc, lmc);
> +		res_cfg->num_imc = lmc;
> +	}
>
>  	return 0;
>
> ---
>
> but then that cfg struct is const :-\
>
> drivers/edac/igen6_edac.c: In function ‘igen6_register_mcis’:
> drivers/edac/igen6_edac.c:1356:34: error: assignment of member ‘num_imc’ in read-only object
>  1356 |                 res_cfg->num_imc = lmc;
>       |                                  ^
>
>
> Unless it is some gunky crap this coreboot does - then we will have to have a
> longer talk.
>
> 😝
In the i10nm_edac driver for Intel Xeon servers, 'cfg' is non-const, and the
field 'cfg->ddr_imc_num' [1] is overwritten at runtime with the number of
detected DDR memory controllers.
Making 'cfg' in the igen6_edac driver non-const again, so that it can be set
to the actual number of detected memory controllers, seems reasonable.
Applying Boris' fix above on top of that is then the simplest way to resolve
the issue. 😊
[1] https://github.com/torvalds/linux/blob/master/drivers/edac/i10nm_base.c#L479
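To illustrate the idea outside the driver, here is a minimal userspace sketch,
not the actual patch. 'struct res_config' is reduced to the one field that
matters, 'register_mcis_sketch()' is a hypothetical stand-in for the tail of
igen6_register_mcis(), and 'detected' plays the role of 'lmc'. The only point
is that a non-const config lets the expected count be clamped to what was
actually found:

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the driver's per-SoC config. */
struct res_config {
	int num_imc;	/* expected number of memory controllers */
};

/*
 * Sketch of the end of the registration path. 'cfg' is deliberately
 * non-const so that num_imc can be overwritten with the detected count.
 */
int register_mcis_sketch(struct res_config *cfg, int detected)
{
	/* Nothing detected at all: give up, as the driver does. */
	if (detected < 1)
		return -ENODEV;

	/* Fewer controllers than expected: warn and work with what exists. */
	if (detected < cfg->num_imc) {
		printf("Expected %d mcs, but only %d detected.\n",
		       cfg->num_imc, detected);
		cfg->num_imc = detected;
	}

	return 0;
}
```

With 'num_imc' initialized to 2 and one controller detected, the call succeeds
and leaves 'num_imc' at 1, so any later code bounded by 'num_imc' only sees
controllers that were actually registered.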
Thanks.
-Qiuxu