Message-ID: <49EC3AB5.5070902@jp.fujitsu.com>
Date: Mon, 20 Apr 2009 18:04:53 +0900
From: Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>
To: Andi Kleen <andi@...stfloor.org>
CC: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
Andi Kleen <ak@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RESEND][PATCH -tip 2/3] x86, mce: Revert "add mce=nopoll option
to disable timer polling"
Andi Kleen wrote:
> Hidetoshi Seto <seto.hidetoshi@...fujitsu.com> writes:
>
>> Disabling only polling but not CMCI is a pointless setting.
>> Instead of "mce=nopoll", which tends to be paired with CMCI disablement,
>> it makes more sense to have a "mce=ignore_ce" option that disables
>> both polling and CMCI at once. A patch for this new implementation
>> will follow this reverting patch.
>>
>> OTOH, once booted, we can disable polling by setting check_interval
>> to 0, but this fact is not mentioned anywhere. Later Andi will post
>> updated documents that address this issue.
>
> I still think that patch has bad semantics because you leave around
> the events in the machine check registers and never clear
> them. Especially with MCA recovery that has very unfortunate side
> effects -- it means the OVER bit will be set and an in-principle
> recoverable MCA will require a panic. Even without MCA recovery it has
> similar problems and will lead to confusing log output for non CE
> MCAs.
>
> I think a patch to not log corrected errors would be reasonable,
> but you still need to clear the events from the machine check
> banks at least.
>
> So I would recommend you add a mce=dont_log_ce or somesuch
> that just guards the mce_log() call in machine_check_poll()

I suppose there are two possible situations:

 1) There is an agent other than the OS (such as the BIOS) that
    checks and clears corrected errors.
    In this case the OS should not clear the MSRs,
    so ignore_ce is the better option here.

 2) There is no agent checking or clearing corrected errors.
    The user just wants to suppress logging of corrected errors.
    In this case dont_log_ce would be the better option.
    (Adding a filter to mcelog would be another solution.)

I don't mind adding three options (no_cmci/ignore_ce/dont_log_ce)
at once. I'll rework 3/3 of this series to do so.
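
To make the intended difference between the three options concrete, here
is a rough user-space toy model. It is only a sketch, not kernel code:
the struct, field, and function names are hypothetical, the bank is
reduced to a single "valid" flag, and the no_cmci behaviour (CMCI off,
polling still on) is an assumption rather than something settled in
this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags, one per proposed boot option (names are illustrative). */
struct mce_opts {
	bool no_cmci;     /* assumed: CMCI is not enabled, polling still runs */
	bool ignore_ce;   /* neither polling nor CMCI; banks are left untouched */
	bool dont_log_ce; /* banks are still read and cleared, nothing is logged */
};

/* Toy stand-in for one machine check bank holding a corrected error. */
struct toy_bank {
	bool status_valid;
};

static bool cmci_enabled(const struct mce_opts *o)
{
	return !o->no_cmci && !o->ignore_ce;
}

static bool poll_enabled(const struct mce_opts *o)
{
	return !o->ignore_ce;
}

/* Toy stand-in for the poll path; *logged counts would-be mce_log() calls. */
static void toy_poll_bank(const struct mce_opts *o, struct toy_bank *b,
			  int *logged)
{
	if (!poll_enabled(o) || !b->status_valid)
		return;          /* ignore_ce: leave the bank for the other agent */
	if (!o->dont_log_ce)
		(*logged)++;     /* the guarded mce_log() call */
	b->status_valid = false; /* once polled, the event is always cleared */
}
```

In this model ignore_ce leaves a pending corrected error in the bank
for the other agent, while dont_log_ce clears it silently.
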
> Also for your use case really the better way would be to use
> some way to let the firmware communicate that it doesn't want the OS
> to log.
Yes. However, AFAIK there is no way to do it yet.
> Also BTW before adding new features like this it would be a good
> idea to first add the bug fixes I posted two weeks ago.
>
> -Andi
The originals of this repost were posted about three weeks ago (Apr. 2)...
I think your patches will go in smoothly if my revert patches are applied
before them.
BTW, could you give me your Acked-by on this 2/3 too?
Thanks,
H.Seto