[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <xgckna5pkrh6rgw4sz7hct6akndpewjmqzsep4lz23j6qdme33@f2y7gdqp2ihw>
Date: Tue, 11 Jun 2024 19:57:26 +0300
From: Serge Semin <fancer.lancer@...il.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Dinh Nguyen <dinguyen@...nel.org>, Michal Simek <michal.simek@....com>,
Alexander Stein <alexander.stein@...tq-group.com>, Tony Luck <tony.luck@...el.com>,
James Morse <james.morse@....com>, Mauro Carvalho Chehab <mchehab@...nel.org>,
Robert Richter <rric@...nel.org>, Manish Narani <manish.narani@...inx.com>,
Punnaiah Choudary Kalluri <punnaiah.choudary.kalluri@...inx.com>, Arnd Bergmann <arnd@...db.de>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>, linux-arm-kernel@...ts.infradead.org, linux-edac@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 02/20] EDAC/synopsys: Fix generic device type
detection procedure
On Mon, Jun 10, 2024 at 10:00:37AM +0200, Borislav Petkov wrote:
> On Wed, Jun 05, 2024 at 01:11:27AM +0300, Serge Semin wrote:
> > As I said because dev_type is the memory DRAM chips type (individual
> > DRAM chip data bus width), and not the entire DQ-bus width or its
> > currently active part. Even from that perspective the function name
> > and the subsequent return value utilization is incorrect.
>
> Well, maybe the author misunderstood it but the result of this goes to
> sysfs:
>
> dimm->dtype = p_data->get_dtype(priv->baseaddr);
>
> which is in Documentation/ABI/testing/sysfs-devices-edac:
>
> What: /sys/devices/system/edac/mc/mc*/(dimm|rank)*/dimm_dev_type
> Date: April 2012
> Contact: Mauro Carvalho Chehab <mchehab+samsung@...nel.org>
> linux-edac@...r.kernel.org
> Description: This attribute file will display what type of DRAM device is
> being utilized on this DIMM (x1, x2, x4, x8, ...).
>
> So you'd need to fix the comment above zynqmp_get_dtype() or I can do so
> too while applying.
Right. I missed the comment indeed. Thanks for spotting that. If it
won't be that much of a burden please fix it on merging the patch in.
If I need to release v7 with this patch included, then I'll fix the
comment myself.
>
> > First of all, not that much of the kinds.
>
> What does that mean: "not that much of the kinds"?
The answer was following that phrase. Just two devices: ZynqMP DDRC
and a DW uMCTL v3.80a-based DDRC available on the Dinh Nguyen's
device. Seeing there were no much changes provided to the driver to
support that controller, the controller must have been compatible with
the Xilinx ZynqMP DDRC in the vast majority of the DW uMCTL2 DDRC
parameters/features.
>
> > Just Xilinx ZynqMP DDRC (based on the DW uMCTL 2.40a IP-core) and some
> > version of DW uMCTL 3.80a being possessed by Dinh Nguyen and, by
> > a lucky coincident, turned to be mainly compatibly with the Xilinx
> > ZynqMP DDR controller.
>
> Then Dinh better holler here what the story is.
>
> > > > 32 or 64. At the same time the bus width mode (MSTR.data_bus_width)
> > > > doesn't change the ECC feature availability. Thus it was wrong to
> > > > determine the ECC state with respect to the DQ-bus width mode.
> >
> > Sorry, but this part doesn't miss anything.
>
> Gramatically:
>
> "The IP-core reference manual says in [1,2] that the ECC support can't
> be enabled during the IP-core synthesizes for the DRAM data bus widths
> other than 16,..."
>
> "synthesizes" looks wrong.
>
> It either needs to be
>
> "... be enabled *while* the IP-core synthesizes for the DRAM..." which
> still doesn't make too much sense.
>
> Or
>
> "... be enabled during the IP-core *synthesis* for the DRAM..."
>
> I don't know what you mean with that "synthesizes" thing.
But you know what it means if "synthesis" would have been utilized, right?
If no, I'll explain. If yes, then you're right. My mistake. I confused two
letters. I'll fix it in v7 should the patch need to be included there.
>
> > First of all, MSTR.data_bus_width field can have only one of the next
> > three values: 0x1, 0x2 and 0x3. All of them are handled in
> > zynqmp_get_dtype(). So in the current (incorrect) implementation it
> > will never return DEV_UNKNOWN.
> >
> > Secondly, dimm->dtype isn't utilized for something significant in the
> > EDAC subsystem, but is just exposed to the user-space via the dev_type
> > sysfs node.
>
> See above.
>
> > So based on that my bet is that since the incorrect code didn't affect
> > the main driver functionality and since the dimm->dtype is just
> > exposed to user-space, the bug has been living just fine unnoticed up
> > until I started digging into the original DW uMCTL2 HW-manuals,
> > started studying the driver code, and decided to convert the driver to
> > supporting generic version of the DW uMCTL2 controller (not only the
> > Xilinx version of it). That's what this series and the next two ones
> > are about - about converting the driver to supporting truly generic DW
> > uMCTL controllers.
>
> I absolutely don't have a problem with that - good idea!
>
> However, we don't break machines and don't introduce regressions.
Who would have argued.)
>
> > > Can those be freely accessed?
> > >
> > > If not, you should say so.
> >
> > No, they can't be.
>
> Then you don't need to mention them.
Well, I see it otherwise. If you posses the databook then by using the
references you can find the info there straight away with no need in
struggling through the _1.5K_ pages file. If you don't have one, then
you can just skip that part of the log.
So I'd rather leave the refs be in the log.
>
> >
> > >
> > > > Fixes: b500b4a029d5 ("EDAC, synopsys: Add ECC support for ZynqMP DDR controller")
> > >
> > > So this commit is in 4.20.
> > >
> > > Does that mean that this fix needs to get backported to all stable
> > > kernels?
> >
> > It's up to the stable maintainers to decide.
>
> Haha, you're funny. How can the stable maintainers know whether each
> patch that has Fixes: tags is stable material?
>
> Nope, that's up to the maintainer to decide.
... and the review committee, and the linux-kernel list members may
participate in the discussion too. But that's not the point here,
right?
Anyway if you wished to know my opinion, then really I don't have a
strong one in this patch regard. From one side the patch does fix an
"oh, that's not good" issue. That's why I has added the Fixes tag. On
the other hand the problem has been here unnoticed for years and
nobody cared. The only parts the incorrect method implementation has
affected was a wrong value returned to the user-space via the
sysfs-node, and the first part of the ECC-availability test procedure
which has turned to be redundant anyway since the zynqmp_get_dtype()
method never returns DEV_UNKNOWN. So my conclusion is the same. It's up
to the maintainers to decide.
>
> > I've tested it on the devices with DW uMCTL 2.51a + DDR3 memory and DW
> > uMCTL 3.10a + DDR4 memory. I am sure this will work for Xilinx ZynqMP
> > too, especially seeing we've already got the Shubhrajyoti Datta Rb
> > tag:
>
> Yes, I've asked him to review that driver because this is not something
> I have or use and so on...
As you can see, I do and of two IP-core major versions (and plenty of
the DW uMCTL2 IP-core databooks). So should you need some help with
testing the bits coming for the Synopsys DW uMCTL2 EDAC driver, just
send a ping to me. I'll test them out.
-Serge(y)
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
Powered by blists - more mailing lists