lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7c58ba57-96ff-4d9b-a014-dfe3fc0c44b5@stanley.mountain>
Date: Tue, 11 Mar 2025 11:12:54 +0300
From: Dan Carpenter <dan.carpenter@...aro.org>
To: David Thompson <davthompson@...dia.com>
Cc: Borislav Petkov <bp@...en8.de>, Shravan Ramani <shravankr@...dia.com>,
	"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
	lkml <linux-kernel@...r.kernel.org>
Subject: Re: [bug report] EDAC, mellanox: Add ECC support for BlueField DDR4

On Tue, Mar 11, 2025 at 02:22:20AM +0000, David Thompson wrote:
> > -----Original Message-----
> > From: Borislav Petkov <bp@...en8.de>
> > Sent: Monday, March 3, 2025 9:52 AM
> > To: Dan Carpenter <dan.carpenter@...aro.org>; David Thompson
> > <davthompson@...dia.com>
> > Cc: Shravan Ramani <sramani@...lanox.com>; linux-edac@...r.kernel.org; lkml
> > <linux-kernel@...r.kernel.org>
> > Subject: Re: [bug report] EDAC, mellanox: Add ECC support for BlueField DDR4
> > 
> > On Thu, Oct 24, 2024 at 11:20:45AM +0300, Dan Carpenter wrote:
> > > Hello Shravan Kumar Ramani,
> > >
> > > Commit 82413e562ea6 ("EDAC, mellanox: Add ECC support for BlueField
> > > DDR4") from Jun 25, 2019 (linux-next), leads to the following Smatch
> > > static checker warning:
> > >
> > > drivers/edac/bluefield_edac.c:205 bluefield_gather_report_ecc() error:
> > uninitialized symbol 'dram_syndrom'.
> > > drivers/edac/bluefield_edac.c:219 bluefield_gather_report_ecc() error:
> > uninitialized symbol 'dram_additional_info'.
> > > drivers/edac/bluefield_edac.c:231 bluefield_gather_report_ecc() error:
> > uninitialized symbol 'edea0'.
> > > drivers/edac/bluefield_edac.c:231 bluefield_gather_report_ecc() error:
> > uninitialized symbol 'edea1'.
> > > drivers/edac/bluefield_edac.c:256 bluefield_edac_check() error: uninitialized
> > symbol 'ecc_count'.
> > >
> > > drivers/edac/bluefield_edac.c
> > >     173 static void bluefield_gather_report_ecc(struct mem_ctl_info *mci,
> > >     174                                         int error_cnt,
> > >     175                                         int is_single_ecc)
> > >     176 {
> > >     177         struct bluefield_edac_priv *priv = mci->pvt_info;
> > >     178         u32 dram_additional_info, err_prank, edea0, edea1;
> > >     179         u32 ecc_latch_select, dram_syndrom, serr, derr, syndrom;
> > >     180         enum hw_event_mc_err_type ecc_type;
> > >     181         u64 ecc_dimm_addr;
> > >     182         int ecc_dimm, err;
> > >     183
> > >     184         ecc_type = is_single_ecc ? HW_EVENT_ERR_CORRECTED :
> > >     185                                    HW_EVENT_ERR_UNCORRECTED;
> > >     186
> > >     187         /*
> > >     188          * Tell the External Memory Interface to populate the relevant
> > >     189          * registers with information about the last ECC error occurrence.
> > >     190          */
> > >     191         ecc_latch_select = MLXBF_ECC_LATCH_SEL__START;
> > >     192         err = bluefield_edac_writel(priv, MLXBF_ECC_LATCH_SEL,
> > ecc_latch_select);
> > >     193         if (err)
> > >     194                 dev_err(priv->dev, "ECC latch select write failed.\n");
> > >     195
> > >     196         /*
> > >     197          * Verify that the ECC reported info in the registers is of the
> > >     198          * same type as the one asked to report. If not, just report the
> > >     199          * error without the detailed information.
> > >     200          */
> > >     201         err = bluefield_edac_readl(priv, MLXBF_SYNDROM,
> > &dram_syndrom);
> > >     202         if (err)
> > >     203                 dev_err(priv->dev, "DRAM syndrom read failed.\n");
> > >
> > > If bluefield_edac_readl() fails then dram_syndrom is uninitialized.
> > >
> > >     204
> > > --> 205         serr = FIELD_GET(MLXBF_SYNDROM__SERR, dram_syndrom);
> > >     206         derr = FIELD_GET(MLXBF_SYNDROM__DERR, dram_syndrom);
> > >     207         syndrom = FIELD_GET(MLXBF_SYNDROM__SYN, dram_syndrom);
> > >
> > 
> > This looks forgotten.
> > 
> > I'm thinking of:
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 8e0736dc2ee0..061149ade8c0 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -8222,8 +8222,7 @@ F:
> > 	Documentation/devicetree/bindings/edac/aspeed-sdram-edac.txt
> >  F:	drivers/edac/aspeed_edac.c
> > 
> >  EDAC-BLUEFIELD
> > -M:	Shravan Kumar Ramani <shravankr@...dia.com>
> > -S:	Supported
> > +S:	Orphan
> >  F:	drivers/edac/bluefield_edac.c
> > 
> >  EDAC-CALXEDA
> > 
> > but lemme Cc people who have touched this recently first.
> > 
> > Thx.
> > 
> > --
> > Regards/Gruss,
> >     Boris.
> > 
> > https://people.kernel.org/tglx/notes-about-netiquette
> 
> Shravan and I will investigate solving this 'smatch' warning.
> 
> I setup a recent linux-next repo, installed the sparse package, and downloaded + built the smatch tool.  
> I execute the following command to run 'smatch' over the entire linux kernel, including bluefield_edac.c:
>    $ make CHECK="/path/to/smatch -p=kernel --file-output" C=1 -j64
> 
> I don't see any warnings related to the bluefield_edac.c module. That is, the generated file 
> "drivers/edac/bluefield_edac.c.smatch" has size 0, and thus no content.
> 
> What is the recommended command line options to pass to Linux kernel build?
> Is there something I am missing here?

The actual issue is the first "return -1" in secure_readl().

You need build the cross function database to see the warning in Smatch.
In fact you'd need to build it twice because it's two functions away.
It's not hard to build the database but it takes a few hours.

~/progs/smatch/release/smatch_scripts/build_kernel_data.sh

I wouldn't bother trying to silence the warning.  We don't worry about
false positives.  The only thing to consider is the kernel code.

Probably bluefield_edac_readl() can't actually fail in real life can
it?  There are a lot of functions like that in the kernel.  If reading
from the PCI stops working, there is nothing you can do in the software
to recover from that, you need to buy new hardware.  If you want I can
just add bluefield_edac_readl() to the list of functions which don't
actually fail.

regards,
dan carpenter

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ