[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20140612.160011.167980216.d.hatayama@jp.fujitsu.com>
Date: Thu, 12 Jun 2014 16:00:11 +0900 (JST)
From: HATAYAMA Daisuke <d.hatayama@...fujitsu.com>
To: peterz@...radead.org
Cc: acme@...nel.org, mingo@...hat.com, paulus@...ba.org, hpa@...or.com,
tglx@...utronix.de, x86@...nel.org, linux-kernel@...r.kernel.org,
matt@...sole-pimps.org
Subject: Re: [PATCH] perf/x86/intel: ignore CondChgd bit to avoid false NMI
handling
From: Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] perf/x86/intel: ignore CondChgd bit to avoid false NMI handling
Date: Wed, 11 Jun 2014 13:54:13 +0200
> On Wed, Jun 11, 2014 at 10:54:48AM +0200, Peter Zijlstra wrote:
>> > I'm not sure about exact behavior of CondChgd bit, in particular when
>> > this bit is set. Although I read Intel System Programmer's Manual to
>> > figure out but I have yet completed that. At least, I think ignoring
>> > CondChgd bit should be enough for NMI watchdog perspective.
>>
>> So yes, the SDM lists the bit as existing but never once mentions it
>> outside of that, and its been doing that at least back to 2008.
>>
>> Ooh, I found it:
>>
>> "The IA32_PERF_GLOBAL_STATUS MSR also provides a ‘sticky bit’ to
>> indicate changes to the state of performance monitoring hardware (see
>> Figure 18-29)."
>>
>> Which is of course completely useless, not to mention inconsistent with
>> the later CondChgd name.
>>
>> HPA, can you explain wtf that bit does and why hatayama-san's ivb feels
>> like having that set on boot?
>
> Matt found in the MSR listing for GLOBAL_STATUS:
>
> 63 CondChg: status bits of this register has changed. If CPUID.0AH: EAX[7:0] > 0
>
> Which brings us to a grand total of 3 different names for this bit.
>
> If it indeed does what it says on the tin, set every time the status
> changes its like the most useless bit ever and I wonder why people
> bothered to spend silicon on it.
>
Yes, I didn't mention in patch description this, but I reached the
same conclusion. The description confuses me because the desciption
and the behaviour of CondChg bit I see on the actual system is not
coincide.
Also, I checked cpuid on the system with Neharlem processor where I
have never seen CondChg bit is set.
[root@...alhost ~]# ./cpuid -r
CPU 0:
0x00000000 0x00: eax=0x0000000b ebx=0x756e6547 ecx=0x6c65746e edx=0x49656e69
0x00000001 0x00: eax=0x000206e6 ebx=0x40200800 ecx=0x00bce3bd edx=0xbfebfbff
<snip>
0x0000000a 0x00: eax=0x07300403 ebx=0x00000044 ecx=0x00000000 edx=0x00000603
^^^^^^^^^^^^^^
So, cpuid tells that CondChg bit is supported on this processor, too.
> In any case, the proposed patch seems fine, just needs a better
> changelog.
>
I see.
I'll write that the problem is that any NMI could be robbed by NMI
watchdog explicitly. Now only patch title says this explicitly. This
is your first comment.
About CondChgd bit, I cannot write more than I see on actual
system. If it's necessary to describe more about CondChgd bit, it
would be appreciated if someone tell me more information about it.
Thanks.
HATAYAMA, Daisuke
Powered by blists - more mailing lists