lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170530172555.5ya3ilfw3sowokjz@hirez.programming.kicks-ass.net>
Date:   Tue, 30 May 2017 19:25:56 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Vince Weaver <vincent.weaver@...ne.edu>
Cc:     Andi Kleen <ak@...ux.intel.com>,
        Stephane Eranian <eranian@...gle.com>,
        "Liang, Kan" <kan.liang@...el.com>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "alexander.shishkin@...ux.intel.com" 
        <alexander.shishkin@...ux.intel.com>,
        "acme@...hat.com" <acme@...hat.com>,
        "jolsa@...hat.com" <jolsa@...hat.com>,
        "torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>,
        "tglx@...utronix.de" <tglx@...utronix.de>
Subject: Re: [PATCH 1/2] perf/x86/intel: enable CPU ref_cycles for GP counter

On Wed, May 24, 2017 at 12:01:50PM -0400, Vince Weaver wrote:
> I already have people really grumpy that you have to have one mmap() page 
> per event, meaning you sacrifice one TLB entry for each event you are 
> measuring.

So there is space in that page. We could maybe look at having an array
of stuff (max 32 entries?) which would cover a whole event group.

Then you only need to mmap() the page for the leading event.

Looking at the layout it would be slightly awkward to do (or we need to
ref the layout, which will undoubtedly be painful too). But for groups
the time fields at least are all shared.

At the very least we need index and offset, ideally pmc_width would be
the same for all counters (can we assume that?).

Something like the below for the uapi changes I suppose. I've not
tried to actually implement it yet, but would something like that be
usable?


---
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index b1c0b187acfe..40ff77e52b9d 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -345,7 +345,8 @@ struct perf_event_attr {
 				context_switch :  1, /* context switch data */
 				write_backward :  1, /* Write ring buffer from end to beginning */
 				namespaces     :  1, /* include namespaces data */
-				__reserved_1   : 35;
+				group_pmc      :  1, /* use group_pmc in perf_event_mmap_page */
+				__reserved_1   : 34;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
@@ -469,7 +470,8 @@ struct perf_event_mmap_page {
 				cap_user_rdpmc		: 1, /* The RDPMC instruction can be used to read counts */
 				cap_user_time		: 1, /* The time_* fields are used */
 				cap_user_time_zero	: 1, /* The time_zero field is used */
-				cap_____res		: 59;
+				cap_group_pmc		: 1, /* The group_pmc field is used, ignore @index and @offset */
+				cap_____res		: 58;
 		};
 	};
 
@@ -530,11 +532,29 @@ struct perf_event_mmap_page {
 	__u64	time_zero;
 	__u32	size;			/* Header size up to __reserved[] fields. */
 
+	__u32	__reserved_4byte_hole;
+
+	/*
+	 * If cap_group_pmc this array replaces @index and @offset. The array
+	 * will contain an entry for each group member lead by the event belonging
+	 * to this mmap().
+	 *
+	 * The @id field can be used to identify which sibling event the respective
+	 * @index and @offset values belong to. Assuming an immutable group, the
+	 * array index will stay constant for each event.
+	 */
+	struct {
+		__u32	index;
+		__u32	__reserved_hole;
+		__s64	offset;
+		__u64	id;		/* event id */
+	} group_pmc[32];
+
 		/*
 		 * Hole for extension of the self monitor capabilities
 		 */
 
-	__u8	__reserved[118*8+4];	/* align to 1k. */
+	__u8	__reserved[22*8];	/* align to 1k. */
 
 	/*
 	 * Control data for the mmap() data buffer.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ