linux-kernel - Re: [PATCH] x86/mcheck/therm_throt.c: Don't log power limit and package level thermal throttle event in mce log

Date:	Tue, 6 Dec 2011 20:56:30 +0100
From:	Borislav Petkov <bp@...64.org>
To:	Tony Luck <tony.luck@...el.com>
Cc:	"Yu, Fenghua" <fenghua.yu@...el.com>,
	H Peter Anvin <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"Brown, Len" <len.brown@...el.com>,
	linux-kernel <linux-kernel@...r.kernel.org>, x86 <x86@...nel.org>
Subject: Re: [PATCH] x86/mcheck/therm_throt.c: Don't log power limit and
 package level thermal throttle event in mce log

On Tue, Dec 06, 2011 at 11:26:03AM -0800, Tony Luck wrote:
> On Tue, Dec 6, 2011 at 11:06 AM, Borislav Petkov <bp@...64.org> wrote:
> > I can see all that. Still, I'm questioning the need for those printks. A
> > user application polling the counters is a much better solution, IMHO,
> > than spamming the logs. IOW, is there a strong reason to have this -
> > even ratelimited - information in the logs and unnerve users, or, would
> > it be better to collect this info somewhere quietly and present it only
> > when something requests it?
> 
> Striking the right balance here is hard - if one has a BIOS that set the
> thresholds at "interesting" values - then you certainly don't want the
> console to be spammed with a lot of useless junk.
> 
> But if there is a real problem - then having someone tell you later that
> you should have been checking some obscure file in /sys to see that
> some thermal/power limit events were being seen may not go over very
> well.

Agreed.

> When we have some comprehensive system health monitoring daemon that
> does check these files, and can be configured to raise suitable
> alerts, then the printks can go away.

Ok, that makes sense, actually. A follow-up: what recovery handling are
you thinking of here, maybe force-suspend the box or disable boosting or
whatever?

All I'm saying is, how does one take care of the real problem you
mention above? I hope you're seeing my point here: I'm simply
questioning whether printks are the optimal mechanism here.
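
As a rough illustration of the polling alternative discussed above, a
userspace monitor could look something like the sketch below. It is not
part of the patch. It assumes the per-CPU thermal_throttle counters
(e.g. core_throttle_count, package_throttle_count) are exposed under
/sys/devices/system/cpu/cpuN/thermal_throttle/; the function names, the
glob pattern, and the 60 second interval are all made up for the example.

```python
import glob
import time

# Assumed sysfs layout; adjust if the counters live elsewhere on your kernel.
SYSFS_GLOB = "/sys/devices/system/cpu/cpu[0-9]*/thermal_throttle/*_count"


def read_counters(pattern=SYSFS_GLOB):
    """Return {path: int} for every readable counter file matching pattern."""
    counters = {}
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                counters[path] = int(f.read().strip())
        except (OSError, ValueError):
            continue  # file vanished or held something unexpected; skip it
    return counters


def changed(previous, current):
    """Return {path: delta} for counters that increased since `previous`.

    A counter that was not present in `previous` is treated as starting at 0.
    """
    return {
        path: value - previous.get(path, 0)
        for path, value in current.items()
        if value > previous.get(path, 0)
    }


def poll(interval=60.0, alert=print):
    """Poll forever, calling `alert` once per counter that moved."""
    previous = read_counters()
    while True:
        time.sleep(interval)
        current = read_counters()
        for path, delta in sorted(changed(previous, current).items()):
            alert(f"{path}: +{delta} events in the last {interval:.0f}s")
        previous = current
```

Calling poll() from a small daemon, with alert swapped for whatever
notification hook is appropriate, would keep the information out of the
logs until something actually asks for it.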

But, before we completely drift off, to answer your original question:
I'm fine with the patch, it is Intel-only anyway so if you guys feel it
is a step in the right direction, you can have my ACK.

The printks story sounds like something we'll not be solving today
anyway, so... :-)

Thanks.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551