Message-ID: <CABPqkBTXvBwwQ1vWkfQRtq7=GxS_KyE94wSOxPaBCffeJ7yc+A@mail.gmail.com>
Date:	Mon, 23 Jun 2014 11:00:18 +0200
From:	Stephane Eranian <eranian@...gle.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	"mingo@...e.hu" <mingo@...e.hu>,
	"ak@...ux.intel.com" <ak@...ux.intel.com>,
	Joe Mario <jmario@...hat.com>, Don Zickus <dzickus@...hat.com>,
	Jiri Olsa <jolsa@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...hat.com>
Subject: Re: [PATCH 2/2] perf/x86: fix constraints for load latency and
 precise events

Peter,

On Mon, Jun 23, 2014 at 10:42 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Thu, Jun 19, 2014 at 05:58:29PM +0200, Stephane Eranian wrote:
>> The load latency does not have to be constrained to counter 3
>> on any of SNB, IVB, HSW. It operates fine on any PEBS-capable
>> counter.
>>
>> The precise store event for SNB, IVB needs to be on counter 3.
>> But on Haswell, precise store is implemented differently and
>> the constraint is not needed anymore, so we remove it.
>>
>> The artificial constraint on counter 3 was used to ease
>> scheduling because the load latency events rely on an
>> extra MSR which is shared for all the counters. But
>> perf_events has an infrastructure to handle shared_regs
>> and does not need to constrain the load latency event to
>> a single counter. It was already using that infrastructure
>> with the constraint on counter 3. By eliminating the constraint
>> on load latency, it becomes possible to measure loads and stores
>> precisely without multiplexing.
>
> So that all makes sense, except why did they pick the same constraint to
> begin with? If they'd picked cnt2 for ll and cnt3 (as per the hardware
> constraint) for st, this would've already been possible right?
>
I don't know why they did it this way. I think it was somehow believed that
ll and st cannot be captured together (and putting both on cnt3 enforces
that). But it seems to be working fine. If someone from Intel can
confirm whether this is okay or not, then we can revisit.

> Except of course, that the SDM states that no other PEBS event should be
> active when using ll; we don't enforce that (although userspace could
> request exclusive). What about this constraint? Is the SDM wrong about
> this?

For LL this is usually the case if you assume a single measurement is
active. But in system-wide mode on a shared system, it is possible to have
other events active on the same CPU. I have not tried that to see the
impact on ll.

You can say the same about PREC_DIST, which up until HSW needs to be
taken alone, i.e., with no other event active. We don't enforce that either,
since enforcing it would cause problems with the NMI watchdog.
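
To illustrate the scheduling point from the changelog, here is a toy model
in Python. It is not kernel code and does not reflect how perf_events
actually schedules events. The schedule() helper, the event names "ll" and
"st", and the assumption that counters 0-3 are the PEBS-capable ones are
all made up for this sketch. It only shows that two events pinned to
counter 3 cannot be active together, while relaxing ll to any PEBS-capable
counter lets both fit.

```python
from itertools import permutations

# Assumption for this sketch: four PEBS-capable counters, numbered 0-3.
PEBS_COUNTERS = {0, 1, 2, 3}


def schedule(events):
    """Try to give each event its own counter.

    events maps a (hypothetical) event name to the set of counters it may
    use. Returns a dict name -> counter, or None if no assignment exists,
    which in this model means the events would have to be multiplexed.
    """
    names = list(events)
    counters = sorted(set().union(*events.values()))
    # Brute force is fine for a handful of counters.
    for perm in permutations(counters, len(names)):
        if all(c in events[n] for n, c in zip(names, perm)):
            return dict(zip(names, perm))
    return None


# Old constraints: load latency and precise store both pinned to counter 3.
old = {"ll": {3}, "st": {3}}

# New constraints (SNB/IVB case): ll on any PEBS counter, st still on 3.
new = {"ll": PEBS_COUNTERS, "st": {3}}

print(schedule(old))  # None: both events compete for counter 3
print(schedule(new))  # an assignment with st on counter 3
```

The model ignores the shared extra MSR used by load latency; in the kernel
that is handled separately through the shared_regs infrastructure.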