Message-ID: <alpine.DEB.2.02.1112231507120.26100@pianoman.cluster.toy>
Date:	Fri, 23 Dec 2011 15:12:40 -0500 (EST)
From:	Vince Weaver <vince@...ter.net>
To:	Ingo Molnar <mingo@...e.hu>
cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Vince Weaver <vweaver1@...s.utk.edu>,
	William Cohen <wcohen@...hat.com>,
	Stephane Eranian <eranian@...gle.com>,
	Arun Sharma <asharma@...com>, linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH 0/6] perf: x86 RDPMC and RDTSC support

On Wed, 21 Dec 2011, Ingo Molnar wrote:

> Here's "pinned events" variant i've measured:
> 
> static u64 mmap_read_self(void *addr)
> {
>         struct perf_event_mmap_page *pc = addr;
>         u32 seq, idx;
>         u64 count;
> 
>         do {
>                 seq = pc->lock;
>                 barrier();
> 
>                 idx = pc->index;
>                 count = pc->offset;
>                 if (idx)
>                         count += rdpmc(idx - 1);
> 
>                 barrier();
>         } while (pc->lock != seq);
> 
>         return count;
> }

Currently you need to do at least two rdpmc() calls when doing a 
start/read/stop (I use this as a benchmark since it's what PAPI code 
commonly does).

This is because the pc->offset value isn't initialized to 0 on start,
but to max_period & cntrval_mask.

I'm not sure what perf_event can do about this short of having a separate
field in the mmap structure that doesn't have the overflow offset 
considerations.


As an aside, I notice that the internal perf_event read() routine on x86 
seems to use rdmsrl() instead of the equivalent rdpmc().  From what I 
understand, at least through core2 (and maybe later) rdpmc() is faster 
than the equivalent rdmsr() call.  I'm not sure if it would be worth
replacing the calls though.

Vince




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
