[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100902192627.GB5538@lenovo>
Date: Thu, 2 Sep 2010 23:26:27 +0400
From: Cyrill Gorcunov <gorcunov@...il.com>
To: Don Zickus <dzickus@...hat.com>
Cc: mingo@...e.hu, peterz@...radead.org, robert.richter@....com,
fweisbec@...il.com, linux-kernel@...r.kernel.org,
ying.huang@...el.com, ming.m.lin@...el.com, yinghai@...nel.org,
andi@...stfloor.org, eranian@...gle.com
Subject: Re: [PATCH 1/3] perf, x86: Fix accidentally ack'ing a second event
on intel perf counter
On Thu, Sep 02, 2010 at 03:07:47PM -0400, Don Zickus wrote:
> During testing of a patch to stop having the perf subsytem swallow nmis,
> it was uncovered that Nehalem boxes were randomly getting unknown nmis
> when using the perf tool.
>
> Moving the ack'ing of the PMI closer to when we get the status allows
> the hardware to properly re-set the PMU bit signaling another PMI was
> triggered during the processing of the first PMI. This allows the new
> logic for dealing with the shortcomings of multiple PMIs to handle the
> extra NMI by 'eat'ing it later.
>
> Now one can wonder why are we getting a second PMI when we disable all
> the PMUs in the begining of the NMI handler to prevent such a case, for
> that I do not know. But I know the fix below helps deal with this quirk.
>
> Tested on multiple Nehalems where the problem was occuring. With the
> patch, the code now loops a second time to handle the second PMI (whereas
> before it was not).
>
> Signed-off-by: Don Zickus <dzickus@...hat.com>
> ---
> arch/x86/kernel/cpu/perf_event_intel.c | 6 ++----
> 1 files changed, 2 insertions(+), 4 deletions(-)
>
Hi Don,
I might be missing something (I'm sure I'm actually) so enlighten me
a bit please
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index d8d86d0..1297bf1 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -712,7 +712,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
> struct perf_sample_data data;
> struct cpu_hw_events *cpuc;
> int bit, loops;
> - u64 ack, status;
> + u64 status;
>
Lets assume 1 counters is triggered and global bit is set as well
we have here
status = intel_pmu_get_status();
> perf_sample_data_init(&data, 0);
>
> @@ -728,6 +728,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
>
> loops = 0;
> again:
> + intel_pmu_ack_status(status);
So here we write just being read value back to CTRL register and _if_ new
overflow happened in this window we've cleared it without processing.
> if (++loops > 100) {
> WARN_ONCE(1, "perfevents: irq loop stuck!\n");
> perf_event_print_debug();
> @@ -736,7 +737,6 @@ again:
> }
>
> inc_irq_stat(apic_perf_irqs);
> - ack = status;
>
> intel_pmu_lbr_read();
>
> @@ -761,8 +761,6 @@ again:
> x86_pmu_stop(event);
> }
>
> - intel_pmu_ack_status(ack);
> -
Here we cleared bits in "status" variable and then we read
status register again without cleaning bits in real physical
register which confuses me.
> /*
> * Repeat if there is more work to be done:
> */
> --
> 1.7.2.2
>
-- Cyrill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists