Message-ID: <20140411131442.GA11636@pd.tnic>
Date: Fri, 11 Apr 2014 15:14:42 +0200
From: Borislav Petkov <bp@...en8.de>
To: Michal Simek <monstr@...str.eu>
Cc: Punnaiah Choudary <kpc528@...il.com>,
Rob Herring <robherring2@...il.com>,
Punnaiah Choudary Kalluri
<punnaiah.choudary.kalluri@...inx.com>,
Doug Thompson <dougthompson@...ssion.com>,
"devicetree@...r.kernel.org" <devicetree@...r.kernel.org>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
linux-edac@...r.kernel.org, Michal Simek <michal.simek@...inx.com>,
Rob Herring <robh+dt@...nel.org>,
Pawel Moll <pawel.moll@....com>,
Mark Rutland <mark.rutland@....com>,
Ian Campbell <ijc+devicetree@...lion.org.uk>,
Kumar Gala <galak@...eaurora.org>,
Rob Landley <rob@...dley.net>,
punnaiah choudary kalluri <kalluripunnaiahchoudary@...il.com>,
punnaiah choudary kalluri <punnaia@...inx.com>,
Russell King <linux@....linux.org.uk>
Subject: Re: [RFC PATCH] edac: add support for ARM PL310 L2 cache parity
On Thu, Apr 10, 2014 at 12:09:03PM +0200, Michal Simek wrote:
> The question here is. This driver is just reporting problem through
> edac interface which is counting that errors and provide an unified
> way how to report problems.
Yes, normally you can use edac for reporting and error counting. But,
as I said earlier, if having two entities touch the same hardware is
your issue and synchronizing around it is too much work just for this
one case, you can simply report the errors with a plain printk, without
the edac interface.
This is why I was asking the practical question of why you even need
the edac interface. If it is only for reporting, use printk and avoid
the problem of having two drivers.
> Maybe as you said we don't need to use edac interface at all but by
> design because every error means that there is the problem and error
> should be reported and system should be reset because we just don't
> know where the problem is. We know that we have a problem.
>
> The question also is if we should execute any code because the problem
> can be with instructions and system should just reset.
>
> Isn't there any security issue that even executing any code is a
> problem?
Well, this is up to you to answer. If a UE (Uncorrectable Error) causes
data to get corrupted on your system, which in turn corrupts visible
state that lands on storage, you definitely want to stop executing any
code. x86 deals very rigorously with errors like those by running an
exception handler. On AMD there's also this thing called syncflood,
which stops any execution and causes a warm reset.
So you have to think hard about what those UEs cause on your systems
and only then act accordingly. If something bad like the above happens,
the last thing you want to do is report the errors to dmesg.
HTH.
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.