Jason pointed out that kgdb no longer works with new nmi-watchdog. Don found that P4 PMU reads CCCR register instead of counter itself (in attempt to catch unflagged event). Testing high bit of counter didn't resolve the situation in a complete way, the kgdb became working but perf top still fails. Lets read the all valuable bits from the counter and check them for being oveflowed. At least this patch eliminates wrong read operation though perf top failure still requires to be investigated. v2: Call checking_wrmsrl only if needed. v3: Check the valuable bits. v4: Leave p4_pmu_clear_cccr_ovf early if P4_CCCR_OVF flag is set. Reported-by: Jason Wessel Reported-by: Don Zickus Signed-off-by: Cyrill Gorcunov CC: Lin Ming CC: Stephane Eranian CC: Peter Zijlstra --- Don, I know this one still doesn't improve the situation on real p4 cpu with perf top if nmi-watchdog enabled, but at least lets put it public for more wide review, maybe some idea will appear :) arch/x86/include/asm/perf_event_p4.h | 3 +++ arch/x86/kernel/cpu/perf_event_p4.c | 28 +++++++++++++++------------- 2 files changed, 18 insertions(+), 13 deletions(-) Index: linux-2.6.git/arch/x86/include/asm/perf_event_p4.h ===================================================================== --- linux-2.6.git.orig/arch/x86/include/asm/perf_event_p4.h +++ linux-2.6.git/arch/x86/include/asm/perf_event_p4.h @@ -20,6 +20,9 @@ #define ARCH_P4_MAX_ESCR (ARCH_P4_TOTAL_ESCR - ARCH_P4_RESERVED_ESCR) #define ARCH_P4_MAX_CCCR (18) +#define ARCH_P4_CNTRVAL_BITS (40) +#define ARCH_P4_CNTRVAL_MASK ((1ULL << ARCH_P4_CNTRVAL_BITS) - 1) + #define P4_ESCR_EVENT_MASK 0x7e000000U #define P4_ESCR_EVENT_SHIFT 25 #define P4_ESCR_EVENTMASK_MASK 0x01fffe00U Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c ===================================================================== --- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event_p4.c +++ linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c @@ -753,19 +753,21 @@ out: static inline int p4_pmu_clear_cccr_ovf(struct hw_perf_event *hwc) { - int overflow = 0; - u32 low, high; + u64 v; - rdmsr(hwc->config_base + hwc->idx, low, high); - - /* we need to check high bit for unflagged overflows */ - if ((low & P4_CCCR_OVF) || !(high & (1 << 31))) { - overflow = 1; - (void)checking_wrmsrl(hwc->config_base + hwc->idx, - ((u64)low) & ~P4_CCCR_OVF); + /* an official way for overflow indication */ + rdmsrl(hwc->config_base + hwc->idx, v); + if (v & P4_CCCR_OVF) { + wrmsrl(hwc->config_base + hwc->idx, v & ~P4_CCCR_OVF); + return 1; } - return overflow; + /* it might be unflagged overflow */ + rdmsrl(hwc->event_base + hwc->idx, v); + if (!(v & ARCH_P4_CNTRVAL_MASK)) + return 1; + + return 0; } static void p4_pmu_disable_pebs(void) @@ -1152,9 +1154,9 @@ static __initconst const struct x86_pmu */ .num_counters = ARCH_P4_MAX_CCCR, .apic = 1, - .cntval_bits = 40, - .cntval_mask = (1ULL << 40) - 1, - .max_period = (1ULL << 39) - 1, + .cntval_bits = ARCH_P4_CNTRVAL_BITS, + .cntval_mask = ARCH_P4_CNTRVAL_MASK, + .max_period = (1ULL << (ARCH_P4_CNTRVAL_BITS - 1)) - 1, .hw_config = p4_hw_config, .schedule_events = p4_pmu_schedule_events, /* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/