linux-kernel - Re: [PATCH 1/3] perf, x86: Fix accidentally ack'ing a second event on intel perf counter

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20100902192627.GB5538@lenovo>
Date:	Thu, 2 Sep 2010 23:26:27 +0400
From:	Cyrill Gorcunov <gorcunov@...il.com>
To:	Don Zickus <dzickus@...hat.com>
Cc:	mingo@...e.hu, peterz@...radead.org, robert.richter@....com,
	fweisbec@...il.com, linux-kernel@...r.kernel.org,
	ying.huang@...el.com, ming.m.lin@...el.com, yinghai@...nel.org,
	andi@...stfloor.org, eranian@...gle.com
Subject: Re: [PATCH 1/3] perf, x86: Fix accidentally ack'ing a second event
	on intel perf counter

On Thu, Sep 02, 2010 at 03:07:47PM -0400, Don Zickus wrote:
> During testing of a patch to stop having the perf subsytem swallow nmis,
> it was uncovered that Nehalem boxes were randomly getting unknown nmis
> when using the perf tool.
> 
> Moving the ack'ing of the PMI closer to when we get the status allows
> the hardware to properly re-set the PMU bit signaling another PMI was
> triggered during the processing of the first PMI.  This allows the new
> logic for dealing with the shortcomings of multiple PMIs to handle the
> extra NMI by 'eat'ing it later.
> 
> Now one can wonder why are we getting a second PMI when we disable all
> the PMUs in the begining of the NMI handler to prevent such a case, for
> that I do not know.  But I know the fix below helps deal with this quirk.
> 
> Tested on multiple Nehalems where the problem was occuring.  With the
> patch, the code now loops a second time to handle the second PMI (whereas
> before it was not).
> 
> Signed-off-by: Don Zickus <dzickus@...hat.com>
> ---
>  arch/x86/kernel/cpu/perf_event_intel.c |    6 ++----
>  1 files changed, 2 insertions(+), 4 deletions(-)
> 

Hi Don,

I might be missing something (I'm sure I'm actually) so enlighten me
a bit please

> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index d8d86d0..1297bf1 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -712,7 +712,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
>  	struct perf_sample_data data;
>  	struct cpu_hw_events *cpuc;
>  	int bit, loops;
> -	u64 ack, status;
> +	u64 status;
>  

Lets assume 1 counters is triggered and global bit is set as well

we have here

	status = intel_pmu_get_status();

>  	perf_sample_data_init(&data, 0);
>  
> @@ -728,6 +728,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
>  
>  	loops = 0;
>  again:
> +	intel_pmu_ack_status(status);

So here we write just being read value back to CTRL register and _if_ new
overflow happened in this window we've cleared it without processing.

>  	if (++loops > 100) {
>  		WARN_ONCE(1, "perfevents: irq loop stuck!\n");
>  		perf_event_print_debug();
> @@ -736,7 +737,6 @@ again:
>  	}
>  
>  	inc_irq_stat(apic_perf_irqs);
> -	ack = status;
>  
>  	intel_pmu_lbr_read();
>  
> @@ -761,8 +761,6 @@ again:
>  			x86_pmu_stop(event);
>  	}
>  
> -	intel_pmu_ack_status(ack);
> -

Here we cleared bits in "status" variable and then we read
status register again without cleaning bits in real physical
register which confuses me.

>  	/*
>  	 * Repeat if there is more work to be done:
>  	 */
> -- 
> 1.7.2.2
> 

	-- Cyrill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/