linux-kernel - Re: [PATCH v2] perf/x86/intel: ignore CondChgd bit to avoid false NMI handling

Message-ID: <53B12CB3.5050508@jp.fujitsu.com>
Date:	Mon, 30 Jun 2014 18:24:03 +0900
From:	"HATAYAMA, Daisuke" <d.hatayama@...fujitsu.com>
To:	hpa@...or.com, ak@...ux.intel.com
CC:	Don Zickus <dzickus@...hat.com>, matt@...sole-pimps.org,
	peterz@...radead.org, acme@...nel.org, mingo@...hat.com,
	paulus@...ba.org, tglx@...utronix.de, x86@...nel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] perf/x86/intel: ignore CondChgd bit to avoid false
 NMI handling

Hello,

(2014/06/17 0:30), Don Zickus wrote:
> On Fri, Jun 13, 2014 at 05:44:37PM +0900, HATAYAMA Daisuke wrote:
>> Currently, an NMI handler for the NMI watchdog may falsely handle any
>> NMI signaled for a different purpose if the CondChgd bit in the
>> MSR_CORE_PERF_GLOBAL_STATUS MSR is set.
>>
>> This commit deals with the issue simply by ignoring the CondChgd bit.
>>
>> Here is a detailed explanation.
>>
>> On x86, the NMI watchdog uses the performance monitoring feature to
>> signal an NMI each time a performance counter overflows.
>>
>> intel_pmu_handle_irq() is called as an NMI_LOCAL handler from the NMI
>> watchdog's NMI handler, perf_event_nmi_handler(). It identifies the
>> owner of a given NMI by looking at the overflow status bits in the
>> MSR_CORE_PERF_GLOBAL_STATUS MSR. If any of the bits are set, it
>> handles the given NMI as its own.
>>
>> The problem is that intel_pmu_handle_irq() doesn't distinguish the
>> CondChgd bit from the other bits. Unlike the other status bits, the
>> CondChgd bit doesn't represent overflow status for a performance
>> counter. Thus, the CondChgd bit cannot be treated as a mark indicating
>> that a given NMI is the NMI watchdog's. As a result, if the CondChgd
>> bit is set, every NMI is falsely handled by the NMI handler of the NMI
>> watchdog. Also, if the type of the falsely handled NMI is NMI_UNKNOWN,
>> NMI_SERR or NMI_IO_CHECK, the corresponding action is never performed
>> until the CondChgd bit is cleared.
>>
>> I noticed this behavior on systems with Ivy Bridge processors: Intel
>> Xeon CPU E5-2630 v2 and Intel Xeon CPU E7-8890 v2. On both systems,
>> the CondChgd bit in the MSR_CORE_PERF_GLOBAL_STATUS MSR is already set
>> at the beginning of boot. The CondChgd bit is then immediately cleared
>> by the next wrmsr to the MSR_CORE_PERF_GLOBAL_CTRL MSR and appears to
>> remain 0.
>>
>> On the other hand, on older processors such as Nehalem (Xeon E7540),
>> the CondChgd bit is not set at the beginning of boot.
>>
>> I'm not sure about the exact behavior of the CondChgd bit, in
>> particular when this bit gets set. I read the Intel System
>> Programmer's Manual to figure that out, but the only descriptions I
>> found are:
>>
>>    In 18.9.1:
>>
>>    "The MSR_PERF_GLOBAL_STATUS MSR also provides a ‘sticky bit’ to
>>     indicate changes to the state of performance monitoring hardware"
>>
>>    In Table 35-2 IA-32 Architectural MSRs
>>
>>    63 CondChg: status bits of this register has changed.
>>
>> These differ from the behaviour I see on the actual systems, as I
>> explained above.
>>
>> At least, I think ignoring the CondChgd bit should be enough from the
>> NMI watchdog's perspective.
>
> As I said in a previous email, I ran into a similar problem and was going
> to solve it by zeroing out all the registers on init (which probably would
> have upset Peter :-) ).  This is a smaller solution and seems ok.  The
> only downside is it is called in the nmi handler.
>
>
> I am working with our customer to try and talk with Intel about why this
> bit is set to begin with.  Our customer says their BIOS doesn't use the PMU
> during boot so it wasn't clear why this is now set on IVBs (though I don't
> see it on Intel whitebox IVBs).
>

I'm also interested in the behaviour of the CondChgd bit on Ivy Bridge
processors.

Have you learned anything about this behaviour?
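
For reference, the ownership check discussed in the quoted description can
be sketched in plain user-space C as below. This is only an illustration,
not the kernel code and not the patch itself: the helper names and the
COND_CHGD_BIT macro are made up for this sketch, and the only fact taken
from the manual is that CondChgd is bit 63 of the global status register.

```c
#include <stdbool.h>
#include <stdint.h>

/* CondChgd is bit 63 of the global status MSR (see the table quoted above). */
#define COND_CHGD_BIT (1ULL << 63)

/*
 * Hypothetical model of the current check: any set bit in the status
 * value makes the handler treat the NMI as its own.
 */
bool nmi_claimed_naive(uint64_t status)
{
	return status != 0;
}

/*
 * Hypothetical model of the proposed check: mask out CondChgd first,
 * so that only the remaining status bits can claim the NMI.
 */
bool nmi_claimed_ignoring_condchgd(uint64_t status)
{
	status &= ~COND_CHGD_BIT;
	return status != 0;
}
```

With only CondChgd set, the first helper claims the NMI while the second
does not, so in that case an unrelated NMI would still reach the other
handlers.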

-- 
Thanks.
HATAYAMA, Daisuke
