Date:	Tue, 13 Jan 2009 10:57:46 -0800
From:	Tim Hockin <thockin@...il.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Andi Kleen <ak@...ux.intel.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	linux-kernel@...r.kernel.org, "H. Peter Anvin" <hpa@...or.com>,
	ying.huang@...el.com, Aaron Durbin <adurbin@...il.com>,
	priyankag@...gle.com
Subject: Re: x86/mce merge, integration hickup + crash, design thoughts

On Tue, Jan 13, 2009 at 9:45 AM, Ingo Molnar <mingo@...e.hu> wrote:
>
> * Andi Kleen <ak@...ux.intel.com> wrote:
>
>> Ingo Molnar wrote:
>>
>>>>> A far more useful design for handling MCE events would be to feed
>>>>> them into printk logging.
>>>> If there's ASCII logging it should be separate from normal printk.
>>>
>>> Well, why?
>>
>> Mostly because the problem is not a kernel issue. Especially large
>> systems with a lot of memory can generate a lot of corrected events (one
>> bit flips in DIMMs are not that uncommon) and it's not good to mix that
>> all up into other kernel messages. It also makes it more clear that it's
>> not a kernel problem, but a hardware problem. I've got feedback over the
>> years that confirm this sight.
>
> Is your argument that syslog is not suitable for the logging of hw events?

I will argue that, yes.

> If that is your argument then the answer is to extend syslog with those
> aspects, instead of widening the quirky /dev based mce ABIs to achieve
> something similar.

I don't like "extend" in this context.  I'd prefer to think of it as a
side-band solution that we need.  And yes, such a solution COULD
obsolete mcelog.  Do you have such a solution done?  Specced?

> If you think that it's suitable then that contradicts your point above.
>
>> [...]
>>
>> None of the points above are real show stoppers for an ASCII interface,
>> but I think with all of this above together considered it's not really
>> an attractive change.
>>
>> I think what could be done is:
>>
>> - Investigate how to make the panic message more information without
>>   adding full decoders.
>>
>> - Implement the default panic timeout method
>>   described above to get automatic on disk logging in common cases.
>>
>> Would that address your concerns?
>
> For me the main blocking point is that mcelog uses a quirky, binary
> side-channel instead of using our main ASCII based logging abstraction
> that we have in Linux: printk + syslog.

It's not particularly quirky - it's a fixed-size ringbuffer.  You're
trying to spin it as some eccentric interface designed by tripped-out
hippies, but it's not really.  It's just a simple piece of plumbing
for a very specific purpose, which exists because there was no better
answer at the time (is there one now?).

> That is a high level argument, while most of your arguments are low level.
> I dont think you can understand my argument if you concentrate on the low
> level only.
>
> We need to resolve this instead of expanding the broken /dev/mcelog
> interface. If we expand the broken interface first then that removes all
> the incentives to enhance the primary logging facility of Linux in this
> area.

I'm 100% on board with that and will even help staff the effort.  This
is something that is VERY HIGHLY desired here.  I already have a
couple people looking at this and other HW-error reporting issues.

> Anyway, this merge window has been very crowded in the x86 space already,
> and the MCE topic is not particularly super-important to have right now
> either, so lets skip it for this cycle so that we have more time to
> cleanly work out these details.

Thanks!

> Let me know if there are must-have fixes in it and we can cherry-pick it
> over into x86/urgent.

Tim
