Date:	Tue, 18 May 2010 13:44:01 -0300
From:	Mauro Carvalho Chehab <mchehab@...hat.com>
To:	Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>
CC:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	bluesmoke-devel@...ts.sourceforge.net,
	Linux Edac Mailing List <linux-edac@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	Ben Woodard <woodard@...hat.com>,
	Matt Domsch <Matt_Domsch@...l.com>,
	Doug Thompson <dougthompson@...ssion.com>,
	Borislav Petkov <bp@...64.org>,
	Tony Luck <tony.luck@...el.com>,
	Brent Young <brent.young@...el.com>
Subject: Re: Hardware Error Kernel Mini-Summit

Hidetoshi Seto wrote:
> (2010/05/18 3:23), Mauro Carvalho Chehab wrote:
>> During the last LF Collaboration Summit, we held a mini-summit [1]
>> intended to improve hardware error detection in the kernel, currently
>> provided by the MCE and EDAC subsystems.
>>
>> The idea for this mini-summit came up after Thomas Gleixner and Ingo
>> Molnar suggested that edac and mce should converge into an error
>> subsystem.
>>
>> I'm enclosing the minutes of the meeting so that they can be
>> reviewed by other kernel hackers who are interested in the topic but
>> unfortunately couldn't come to the meeting.
>>
>> Btw, during the meeting, it was decided that the EDAC ML would work
>> better if moved to vger, so I'm copying both the old and the new edac
>> mailing lists here.
>>
>> [1] http://events.linuxfoundation.org/lfcs2010/edac
>>
>> ---
> 
> Thank you very much for providing this report.
> 
> I agree that we should have a well-organized error subsystem that
> covers all error sources in the system and that provides a
> sufficiently simple and powerful API for users. As one of the
> interested absentees, I think I could be of some help to you
> (e.g. x86 low level).

Thank you for your offer. Any help is welcome.
>
> It might be off-topic here, but I'd like to point out that you missed
> the PCIe AER subsystem, which handles hardware errors on PCIe
> devices nowadays (it works well on ppc, x86 and so on).
> Given that APEI also covers PCIe errors and that some systems can
> map MC registers to PCI configuration space, I think there is no
> way for the new error subsystem to ignore I/O device errors while
> it handles errors on CPU/memory and cooperates with APEI.

Yes, it makes sense to also integrate the PCIe AER subsystem. IMO, the
first step is to provide an error core integrated with perf, and then
start integrating the several error systems around it.

-- 

Cheers,
Mauro
