linux-kernel - Re: [RFC PATCH 2/3 v3] perf: Implement Nehalem uncore pmu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1291978335.6803.97.camel@twins>
Date:	Fri, 10 Dec 2010 11:52:15 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Stephane Eranian <eranian@...gle.com>
Cc:	Lin Ming <ming.m.lin@...el.com>, Andi Kleen <andi@...stfloor.org>,
	Ingo Molnar <mingo@...e.hu>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Arjan van de Ven <arjan@...radead.org>,
	lkml <linux-kernel@...r.kernel.org>,
	Cyrill Gorcunov <gorcunov@...il.com>
Subject: Re: [RFC PATCH 2/3 v3] perf: Implement Nehalem uncore pmu

On Fri, 2010-12-10 at 11:47 +0100, Peter Zijlstra wrote:
> On Fri, 2010-12-10 at 00:46 +0100, Stephane Eranian wrote:
> > Hi,
> > 
> > So I have tested this patch a bit on WSM and as I expected there
> > are issues with sampling.
> > 
> > When HT is on, both siblings CPUs get the interrupt. The HW does not
> > allow you to only point interrupts to a single HT thread (CPU).
> 
> Egads, how ugly :/
> 
> > I did verify that indeed both threads get the interrupt and that you have a
> > race condition. Both sibling CPUs stop uncore, get the status. They may get
> > the same overflow status. Both will pass the uncore->active_mask because
> > it's shared among siblings cores. Thus,  you have a race for the whole
> > interrupt handler execution.
> > 
> > You need some serialization in there. But the patch does not address this.
> > The problem is different from the back-to-back interrupt issue that
> > Don worked on.
> > The per-cpu marked/handled trick cannot work to avoid this problem.
> > 
> > You cannot simply say "the lowest indexed" CPU of a sibling pair
> > handles the interrupt
> > because you don't know if this in an uncore intr, core interrupt or
> > something else. You
> > need to check. That means each HT thread needs to check uncore
> > ovfl_status. IF the
> > status is zero, then return. Otherwise, you need to do a 2nd level
> > check before you can
> > execute the handler. You need to know if the sibling CPU has already
> > "consumed" that
> > interrupt.
> > 
> > I think you need some sort of generation counter per physical core and
> > per HT thread.
> > On interrupt, you could do something along the line of:
> >       if (mycpu->intr_count == mysibling->intr_count) {
> >           then mycpu->intr_count++
> >           execute intr_handler()
> >       } else {
> >           mycpu->intr_count++
> >           return;
> >       }
> > Of course, the above needs some atomicity and ad locking 
> 
> Does that guarantee that the same sibling handles all interrupts? Since
> a lot of the infrastructure uses local*_t we're not good with cross-cpu
> stuff.
> 
> Damn what a mess.. we need to serialize enough for both cpus to at least
> see the overflow bit.. maybe something like:
> 
> 
> struct intel_percore {
>    ...
>    atomic_t uncore_barrier;
> };
> 
> void uncore_barrier(void)
> {
> 	struct intel_percore *percore = this_cpu_ptr(cpu_hw_events)->percore;
> 	int armed;
> 
> 	armed = atomic_cmpxchg(&percore->uncore_barrier, 0, 1) == 0;
> 	if (armed) {
> 		/* we armed, it, now wait for completion */
> 		while (atomic_read(&percore->uncore_barrier))
> 			cpu_relax();
> 	} else {
> 		/* our sibling must have, decrement it */
> 		if (atomic_cmpxchg(&percore->uncore_barrier, 1, 0) != 1)
> 			BUG();
> 	}
> }
> 
> Then have something like:
> 
> handle_uncore_interrupt()
> {
> 	u64 overflow = rdmsrl(MSR_UNCORE_PERF_GLOBAL_OVF_STATUS);
> 	int cpu;
> 
> 	if (!overflow)
> 		return 0; /* not our interrupt to handle */
> 
> 	uncore_barrier(); /* wait so our sibling will also observe the overflow */
> 
> 	cpu = smp_processor_id();
> 	if (cpu != cpumask_first(topology_thread_cpumask(cpu)))
> 		return 1; /* our sibling will handle it, eat the NMI */
> 
> 	/* OK, we've got an overflow and we're the first CPU in the thread mask */
> 
> 	... do fancy stuff ...
> 
> 	return 1; /* we handled it, eat the NMI */
> }

That would of course need to also grow some smarts to detect if there is
only 1 sibling online.

CC'ed Cyrill as P4 might have something similar.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/