linux-kernel - Re: [NAK] Re: [PATCH -v2 9/9] ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support


Message-ID: <20101025125531.GA6075@elte.hu>
Date:	Mon, 25 Oct 2010 14:55:31 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Huang Ying <ying.huang@...el.com>, Len Brown <lenb@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
	Borislav Petkov <petkovbb@...glemail.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, Don Zickus <dzickus@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mauro Carvalho Chehab <mchehab@...hat.com>,
	Arjan van de Ven <arjan@...radead.org>
Subject: Re: [NAK] Re: [PATCH -v2 9/9] ACPI, APEI, Generic Hardware Error
 Source POLL/IRQ/NMI notification type support

* Andi Kleen <andi@...stfloor.org> wrote:

> On Mon, Oct 25, 2010 at 01:15:30PM +0200, Ingo Molnar wrote:
> 
> > > > > einj.c: it's about the 3rd separate 'error injection' concept that got 
> > > > > introduced ...
> > > > 
> > > > EINJ is a true platform feature, not just a software feature. We need to support 
> > > > it to debug various hardware error features.
> > > 
> > > Also having multiple error injecting interfaces is a good thing.
> > 
> > It's never a good thing to have separate, vendor dependent interfaces for what 
> > to the user is basically the same conceptual thing!
> 
> Perhaps a simple example (simplified, in practice there are more complications) 
> makes it more clear:
> 
> The memory error handler takes different actions depending on the state of the 
> page on which the error occurs.

What you appear to be arguing for is the ability to inject different types of 
events.

_OF COURSE_ we want that.

Just like we want to be able to _receive_ multiple types of events from wildly 
different hardware and wildly different kernel subsystems ...

Duh.

That desire does not necessitate 'three different injectors' at all. It does not 
necessitate multiple incompatible facilities with random ABIs.

What we want is a single injector facility visible to RAS/hw-testing/etc. apps, and 
a way to pass in attributes that specify the kind of event that we want to trigger.
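
A minimal sketch of that shape, in userspace C: one entry point that takes an 
attribute struct, with per-kind backends registered behind it. Every name here 
(inj_attr, inj_register, inj_trigger, the event kinds, the two stand-in backends) 
is hypothetical and chosen for illustration; none of it is an existing kernel 
interface or the facility being worked on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event kinds; not taken from any kernel header. */
enum inj_kind {
	INJ_MEM_CORRECTED,
	INJ_MEM_UNCORRECTED,
	INJ_PCIE_AER,
	INJ_KIND_MAX
};

/* Attributes that describe the event to trigger (hypothetical layout). */
struct inj_attr {
	enum inj_kind kind;
	uint64_t addr;   /* physical address, if the kind uses one */
	uint32_t flags;  /* kind-specific modifiers */
};

/* A backend does the actual injection: 0 on success, negative on failure. */
typedef int (*inj_backend_fn)(const struct inj_attr *attr);

static inj_backend_fn inj_backends[INJ_KIND_MAX];

/* Register one backend per kind; -1 if the kind is invalid or already taken. */
int inj_register(enum inj_kind kind, inj_backend_fn fn)
{
	assert(fn != NULL);
	if ((unsigned)kind >= INJ_KIND_MAX || inj_backends[kind] != NULL)
		return -1;
	inj_backends[kind] = fn;
	return 0;
}

/* The single entry point that testing apps would call. */
int inj_trigger(const struct inj_attr *attr)
{
	if (attr == NULL || (unsigned)attr->kind >= INJ_KIND_MAX)
		return -1;  /* malformed request */
	if (inj_backends[attr->kind] == NULL)
		return -2;  /* no backend supports this kind */
	return inj_backends[attr->kind](attr);
}

/* Stand-in backends that only record the address they were handed. */
static uint64_t last_firmware_addr;
static uint64_t last_software_addr;

static int firmware_backend(const struct inj_attr *attr)
{
	last_firmware_addr = attr->addr;
	return 0;
}

static int software_backend(const struct inj_attr *attr)
{
	last_software_addr = attr->addr;
	return 0;
}
```

The caller only ever fills in an inj_attr and calls inj_trigger(); whether a 
firmware mechanism or a pure software path performs the injection is decided by 
whichever backend registered for that kind.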

Also note that you completely ignored the other basis of my objection and NAK: that 
the whole ad-hoc event log export that this code does via the /dev/erst-dbg ABI is 
actively harmful.

> Would it be nice if there was a single great injector that covers everything? Yes. 
> Is it realistic? No.

Everyone else working on this area thinks it's realistic, and is in fact working on 
such a facility.

The main thing that is causing confusion here is not the technical viability of such 
a project (it's evidently doable and desirable), but your unwillingness to 
cooperate. If Intel goes off in random directions and essentially obstructs upstream 
projects then we won't have this implemented on Intel CPUs sanely and cleanly - 
despite Mauro's best efforts on the Nehalem code.

But you should really not bring that up as some kind of positive argument ...

Thanks,

	Ingo