Message-ID: <20140731113324.GB4375@pd.tnic>
Date: Thu, 31 Jul 2014 13:33:24 +0200
From: Borislav Petkov <bp@...en8.de>
To: Punnaiah Choudary Kalluri <punnaiah.choudary.kalluri@...inx.com>
Cc: "dougthompson@...ssion.com" <dougthompson@...ssion.com>,
"robh+dt@...nel.org" <robh+dt@...nel.org>,
"pawel.moll@....com" <pawel.moll@....com>,
Michal Simek <michals@...inx.com>,
"mark.rutland@....com" <mark.rutland@....com>,
"ijc+devicetree@...lion.org.uk" <ijc+devicetree@...lion.org.uk>,
Kumar Gala <galak@...eaurora.org>,
Rob Landley <rob@...dley.net>,
"devicetree@...r.kernel.org" <devicetree@...r.kernel.org>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
Punnaiah Choudary <kpc528@...il.com>,
Anirudha Sarangi <anirudh@...inx.com>,
Srikanth Vemula <svemula@...inx.com>
Subject: Re: [RFC PATCH v3] edac: synps: Added EDAC support for zynq ddr ecc controller

On Wed, Jul 30, 2014 at 03:41:47PM +0000, Punnaiah Choudary Kalluri wrote:
> >So you're telling me that you want one edac driver for *two* memory
> >controllers which can be present on a single system *at* *the* *same*
> >*time*? Is that it?
>
> Yes.
Oh, this'll be fun. :-P
> >
> >How exactly is that topology supposed to look like, work, etc, etc? What
> >kind of error reporting do you imagine you want to do with EDAC?
>
> Zynq (All programmable SOC) contains a dual core ARM cortex A9 based processing
> System(PS) and Xilinx programmable logic(PL) in a single device.
>
> Assume the application is a broadcast camera. The design for this system use PS as
> Control plane and use PL as data plane for processing the video data. So, the design
> may have two different memory controllers one in PS and another one in PL.
> PS is running with Linux OS and PL doesn't have the OS and it is controlled by the
> OS running on PS.
Ok.
> Now the requirement is to develop a monitoring system for all the
> hardware failures Including ddr ecc errors. Since the broadcast camera
> involves more data processing and DDR memory access, there is a high
> probability of getting ddr ecc errors over the period.

So you're saying the camera has memory with ECC? People really pay
attention to getting error-free video frames? :-)

Or is this just an example? You're saying the PL will need ECC for some
applications?

> So, the user should be intimated with these errors when they occur and if the error rate
> is high, then the user can consider the preventive methods. Without this error reporting
> mechanism it is difficult to debug the issues like memory corruption, kernel oops which
> may occur due to ddr ecc failures.
>
> Since, the memory controllers are different, it need two edac drivers for reporting the ecc
> errors and also maintaining the statistics of that particular memory controller. With the current
> framework, there is a chance that both the drivers get mc_num as zero and malfunction.
> Assume the code for the two drivers looks like below
>
> Driver 1:
> mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
> sizeof(struct ctrl1_drvdata));
>
> Driver 2:
> mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
> sizeof(struct ctrl2_drvdata));
>
> Issue:
> Since driver is providing the mc_num to framework, now there is chance that only one device active as
> both the drivers claiming the same number.
>
> Solution 1:
> Keep two drivers in single file and use static global variable for tracking the mc_num. This solution looks
> good but the drivers may not be generic as these driver would be in a zynq_edac.c file. So others may not
> reuse these drivers

Ok, I think I know what you mean. And this architecture works just fine.
On x86 we have one EDAC instance per memory controller, so on a multinode
machine with multiple memory controllers, we do edac_mc_alloc() per
memory controller.

For examples, see amd64_edac::init_one_instance(); sb_edac should
have this too. So you basically have a local array of instances which
you allocate and set up.
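
A rough user-space sketch of that layout, not kernel code: everything
named mock_* or zynq_* below is hypothetical, and the mock registry only
assumes that the core refuses a second controller claiming an index that
is already taken. It shows one file owning both the PS and PL instances
and handing each a distinct index from a file-local counter.

```c
/*
 * User-space mock, for illustration only. The real edac_mc_alloc() lives
 * in the kernel; mock_* and zynq_* names here are made up.
 */
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_MC 4

struct mock_mci {
	int mc_idx;        /* index this instance asks for */
	const char *name;  /* which controller it describes */
};

/* Stand-in for the core's table of registered controllers. */
static struct mock_mci *mock_registered[MOCK_MAX_MC];

/* Register an instance; return -1 if the index is out of range or taken. */
static int mock_add_mc(struct mock_mci *mci)
{
	assert(mci != NULL);
	if (mci->mc_idx < 0 || mci->mc_idx >= MOCK_MAX_MC)
		return -1;
	if (mock_registered[mci->mc_idx] != NULL)
		return -1;
	mock_registered[mci->mc_idx] = mci;
	return 0;
}

/* One file owns both controllers: a local array of instances ... */
static struct mock_mci zynq_instances[2] = {
	{ -1, "PS DDR controller" },
	{ -1, "PL DDR controller" },
};

/* ... and a file-local counter that hands out the next free index. */
static int zynq_next_idx;

/* Set up one instance; the counter only advances on success. */
static int zynq_init_one_instance(struct mock_mci *mci)
{
	mci->mc_idx = zynq_next_idx;
	if (mock_add_mc(mci) != 0)
		return -1;
	zynq_next_idx++;
	return 0;
}
```

Calling zynq_init_one_instance() once per entry of zynq_instances gives
the two controllers different indices, whereas a third party that
hard-codes index 0 afterwards is rejected by the mock registry.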

If someone wants to reuse the driver, then we can talk about this later,
when the time comes.

For now I think you should put both PS and PL stuff in one file,
zynq_edac or so.

Any issues with this design?

--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.