Message-ID: <CABPqkBTXvBwwQ1vWkfQRtq7=GxS_KyE94wSOxPaBCffeJ7yc+A@mail.gmail.com>
Date: Mon, 23 Jun 2014 11:00:18 +0200
From: Stephane Eranian <eranian@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
"mingo@...e.hu" <mingo@...e.hu>,
"ak@...ux.intel.com" <ak@...ux.intel.com>,
Joe Mario <jmario@...hat.com>, Don Zickus <dzickus@...hat.com>,
Jiri Olsa <jolsa@...hat.com>,
Arnaldo Carvalho de Melo <acme@...hat.com>
Subject: Re: [PATCH 2/2] perf/x86: fix constraints for load latency and precise events
Peter,
On Mon, Jun 23, 2014 at 10:42 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Thu, Jun 19, 2014 at 05:58:29PM +0200, Stephane Eranian wrote:
>> The load latency does not have to be constrained to counter 3
>> on any of SNB, IVB, HSW. It operates fine on any PEBS-capable
>> counter.
>>
>> The precise store event for SNB, IVB needs to be on counter 3.
>> But on Haswell, precise store is implemented differently and
>> the constraint is not needed anymore, so we remove it.
>>
>> The artificial constraint on counter 3 was used to ease
>> scheduling because the load latency events rely on an
>> extra MSR which is shared for all the counters. But
>> perf_events has an infrastructure to handle shared_regs
>> and does not need to constrain the load latency event to
>> a single counter. It was already using that infrastructure
>> with the constraint on counter 3. By eliminating the constraint
>> on load latency, it becomes possible to measure loads and stores
>> precisely without multiplexing.
>
> So that all makes sense, except why did they pick the same constraint to
> begin with? If they'd picked cnt2 for ll and cnt3 (as per the hardware
> constraint) for st, this would've already been possible right?
>
I don't know why they did it this way. I think it was somehow believed that
ll and st cannot be captured together (and putting both on cnt3 enforces
that). But it seems to be working fine. If someone from Intel can
confirm whether this is okay or not, then we can revisit.
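
To illustrate the scheduling point, here is a toy model in Python. It is
not kernel code: the names schedule, CNT3_ONLY and ANY_PEBS are made up for
this sketch, it assumes four PEBS-capable counters, and it ignores the
shared extra MSR entirely. Each constraint is a bitmask of allowed counters.

```python
from itertools import permutations

# Bit i set means the event may run on counter i (hypothetical names).
CNT3_ONLY = 0x8  # counter 3 only
ANY_PEBS = 0xF   # any of counters 0-3 (assumed PEBS-capable)


def schedule(events, num_counters=4):
    """Give each event a distinct allowed counter; return None if impossible."""
    names = list(events)
    # Brute force is fine for a handful of counters.
    for combo in permutations(range(num_counters), len(names)):
        if all((events[n] >> c) & 1 for n, c in zip(names, combo)):
            return dict(zip(names, combo))
    return None


# Old constraints: both events pinned to counter 3.
before = {"load_latency": CNT3_ONLY, "precise_store": CNT3_ONLY}
# Constraints as proposed for SNB/IVB: only the precise store stays on counter 3.
after = {"load_latency": ANY_PEBS, "precise_store": CNT3_ONLY}

print(schedule(before))  # None: both events compete for counter 3
print(schedule(after))   # {'load_latency': 0, 'precise_store': 3}
```

With both masks at 0x8 no assignment exists, so the two events would have
to be multiplexed. Relaxing only the load latency mask is enough for both
to be scheduled at once.
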
> Except of course, that the SDM states that no other PEBS event should be
> active when using ll; we don't enforce that (although userspace could
> request exclusive). What about this constraint? Is the SDM wrong about
> this?
For LL this is usually the case if you assume a single measurement is
active. But in system-wide mode on a shared system, it is possible to have
other events active on the same CPU. I have not tried that to see the
impact on ll.
You can say the same about PREC_DIST, which up until HSW needs to be
taken alone, i.e., with no other event active. We don't enforce that either,
because it would cause problems with the NMI watchdog.