Message-ID: <43F901BD926A4E43B106BF17856F075501A22B5563@orsmsx508.amr.corp.intel.com>
Date: Tue, 6 Dec 2011 11:27:34 -0800
From: "Yu, Fenghua" <fenghua.yu@...el.com>
To: Borislav Petkov <bp@...64.org>
CC: "Luck, Tony" <tony.luck@...el.com>, H Peter Anvin <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>,
Andrew Morton <akpm@...ux-foundation.org>,
"Brown, Len" <len.brown@...el.com>,
linux-kernel <linux-kernel@...r.kernel.org>, x86 <x86@...nel.org>
Subject: RE: [PATCH] x86/mcheck/therm_throt.c: Don't log power limit and
package level thermal throttle event in mce log
> -----Original Message-----
> From: Borislav Petkov [mailto:bp@...64.org]
> Sent: Tuesday, December 06, 2011 11:07 AM
> To: Yu, Fenghua
> Cc: Luck, Tony; H Peter Anvin; Thomas Gleixner; Ingo Molnar; Andrew
> Morton; Brown, Len; linux-kernel; x86
> Subject: Re: [PATCH] x86/mcheck/therm_throt.c: Don't log power limit
> and package level thermal throttle event in mce log
>
> On Tue, Dec 06, 2011 at 09:48:41AM -0800, Yu, Fenghua wrote:
>
> > The printk is one way to notify users about the power limit and
> > thermal throttle. The printk only dumps the events in an interval
> > (300*HZ).
> >
> > Another way is to count the events in
> > /sys/devices/system/cpu/cpu#/thermal_throttle. In this way, kernel
> > logs every interrupt on any cpu into respective counters. User
> > application can poll the counters and get more accurate and timely
> > information for the events.
> >
> > As explained in this patch, core level thermal throttle is still
> > logged in mcelog for legacy reason after this patch is applied.
>
> I can see all that. Still, I'm questioning the need for those printks. A
> user application polling the counters is a much better solution, IMHO,
> than spamming the logs. IOW, is there a strong reason to have this -
> even ratelimited - information in the logs and unnerve users, or would
> it be better to collect this info somewhere quietly and present it only
> when something requests it?

The printks are legacy code. People may already expect that information,
and removing them may draw complaints.

Also, thermal throttle and power limit events shouldn't happen too often
or last too long, so in the normal case the printks shouldn't fire many
times and shouldn't spam dmesg. If the printks do print many times over a
long period, the user does need to pay attention to the events. One
concrete example is what happened on the x220: the information printed by
the printks was noticed and referred to, i.e. the info was useful in that
case. From that perspective, the printks are necessary.

Having said that, the purpose of THIS patch is to remove thermal throttle
and power limit events from mcelog so that people don't mistake them for
scary hardware errors. If the printks are considered unnecessary, removing
them should be done in a separate patch.
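
For reference, the counter-polling alternative discussed above could look
roughly like the following user-space sketch. It is only an illustration,
not part of the patch. It assumes the per-cpu counters are plain integer
files under /sys/devices/system/cpu/cpu#/thermal_throttle/ and reads
whatever files it finds there instead of hard-coding counter names. The
function names (read_throttle_counters, counter_deltas, poll) and the
5 second interval are made up for this example.

```python
import glob
import os
import time


def read_throttle_counters(sysfs_root="/sys/devices/system/cpu"):
    """Snapshot every integer counter under cpu*/thermal_throttle/.

    Returns a dict keyed by (cpu directory name, counter file name).
    Assumption: each counter file holds a single decimal integer.
    """
    counters = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "thermal_throttle", "*")
    for path in sorted(glob.glob(pattern)):
        cpu = os.path.basename(os.path.dirname(os.path.dirname(path)))
        try:
            with open(path) as f:
                counters[(cpu, os.path.basename(path))] = int(f.read().strip())
        except (OSError, ValueError):
            # Skip entries that are unreadable or not a plain integer.
            continue
    return counters


def counter_deltas(before, after):
    """Return only the counters that increased between two snapshots."""
    return {
        key: after[key] - before.get(key, 0)
        for key in after
        if after[key] > before.get(key, 0)
    }


def poll(interval_s=5.0, iterations=None, sysfs_root="/sys/devices/system/cpu"):
    """Print counter increases every interval_s seconds.

    iterations=None loops forever; a number stops after that many rounds.
    """
    previous = read_throttle_counters(sysfs_root)
    done = 0
    while iterations is None or done < iterations:
        time.sleep(interval_s)
        current = read_throttle_counters(sysfs_root)
        for (cpu, name), delta in sorted(counter_deltas(previous, current).items()):
            print(f"{cpu} {name}: +{delta}")
        previous = current
        done += 1
```

Calling poll() from a small monitoring script would report only the
counters that moved since the previous round, so a quiet system produces
no output at all.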
Thanks.
-Fenghua