Message-ID: <507B9E6B.9010208@linux.vnet.ibm.com>
Date:	Mon, 15 Oct 2012 10:56:03 +0530
From:	Anshuman Khandual <khandual@...ux.vnet.ibm.com>
To:	Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>
CC:	acme@...hat.com, mingo@...nel.org, peterz@...radead.org,
	eranian@...gle.com, robert.richter@....com, asharma@...com,
	mpjohn@...ibm.com, Anton Blanchard <anton@....ibm.com>,
	paulus@...ba.org, linux-kernel@...r.kernel.org,
	linuxppc-dev@...abs.org
Subject: Re: [RFC][PATCH] perf: Add a few generic stalled-cycles events

On 10/12/2012 06:58 AM, Sukadev Bhattiprolu wrote:
> 
> From 89cb6a25b9f714e55a379467a832ee015014ed11 Mon Sep 17 00:00:00 2001
> From: Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>
> Date: Tue, 18 Sep 2012 10:59:01 -0700
> Subject: [PATCH] perf: Add a few generic stalled-cycles events
> 
> The existing generic event 'stalled-cycles-backend' corresponds to
> PM_CMPLU_STALL event in Power7. While this event is useful, detailed
> performance analysis often requires us to find more specific reasons
> for the stalled cycle. For instance, stalled cycles in Power7 can
> occur due to, among others:
> 
> 	- Instruction fetch unit (IFU)
> 	- Load-store unit (LSU)
> 	- Fixed-point unit (FXU)
> 	- Branch unit (BRU)
> 
> While it is possible to use raw codes to monitor these events, it quickly
> becomes cumbersome with performance analysis frequently requiring mapping
> the raw event codes in reports to their symbolic names.
> 
> This patch is a proposal to try and generalize such perf events. Since
> the code changes are quite simple, I bunched all the 4 events together.
> 
> I am not familiar with how readily these events would map to other
> architectures. Here is some information on the events for Power7:
> 
> 	stalled-cycles-fixed-point (PM_CMPLU_STALL_FXU)
> 
> 		Following a completion stall, the last instruction to finish
> 		before completion resumes was from the Fixed Point Unit.
> 
> 		Completion stall is any period when no groups completed and
> 		the completion table was not empty for that thread.
> 
> 	stalled-cycles-load-store (PM_CMPLU_STALL_LSU)
> 
> 		Following a completion stall, the last instruction to finish
> 		before completion resumes was from the Load-Store Unit.
> 
> 	stalled-cycles-instruction-fetch (PM_CMPLU_STALL_IFU)
> 
> 		Following a completion stall, the last instruction to finish
> 		before completion resumes was from the Instruction Fetch Unit.
> 
> 	stalled-cycles-branch (PM_CMPLU_STALL_BRU)
> 
> 		Following a completion stall, the last instruction to finish
> 		before completion resumes was from the Branch Unit.
> 
> Looking for feedback on this approach and if this can be further extended.
> Power7 has 530 events[2] out of which a "CPI stack analysis"[1] uses about 26
> events.
> 
> 
> [1] CPI Stack analysis
> 	https://www.power.org/documentation/commonly-used-metrics-for-performance-analysis
> 
> [2] Power7 events:
> 	https://www.power.org/documentation/comprehensive-pmu-event-reference-power7/

Here we should try to come up with a generic list of the places in the
processor where cycles can stall.

PERF_COUNT_HW_STALLED_CYCLES_FIXED_POINT
PERF_COUNT_HW_STALLED_CYCLES_LOAD_STORE
PERF_COUNT_HW_STALLED_CYCLES_INSTRUCTION_FETCH
PERF_COUNT_HW_STALLED_CYCLES_BRANCH
PERF_COUNT_HW_STALLED_CYCLES_<ANY_OTHER_PLACE1>
PERF_COUNT_HW_STALLED_CYCLES_<ANY_OTHER_PLACE2>
PERF_COUNT_HW_STALLED_CYCLES_<ANY_OTHER_PLACE3>
-----------------------------------------------

This generic list can be a superset that accommodates all architectures, giving
each one the flexibility to implement events selectively thereafter. Stall
locations are very important for CPI analysis in real-world use cases, so this
will definitely help us in that direction.
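
A minimal sketch of the superset idea, as standalone C rather than kernel code.
Everything here is an assumption for illustration: the enum mirrors the names
proposed above but is not part of the perf ABI, the two architectures are
hypothetical, and the raw codes are made-up placeholders, not real Power7
event codes. Each architecture fills in only the entries it can count and
leaves the rest unsupported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generic stall locations, named after the proposed list.
 * These IDs are illustrative only and not part of the perf ABI. */
enum stall_event_id {
	STALLED_CYCLES_FIXED_POINT,
	STALLED_CYCLES_LOAD_STORE,
	STALLED_CYCLES_INSTRUCTION_FETCH,
	STALLED_CYCLES_BRANCH,
	STALLED_CYCLES_MAX
};

/* Assumed sentinel: a raw code of 0 means "not implemented here". */
#define STALL_EVENT_UNSUPPORTED 0

/* Per-architecture table mapping each generic ID to a raw event code. */
struct arch_stall_map {
	const char *arch_name;
	uint64_t raw_code[STALLED_CYCLES_MAX];
};

/* Hypothetical architecture implementing the whole superset.
 * The codes are placeholders, not real hardware event codes. */
static const struct arch_stall_map example_full_arch = {
	.arch_name = "example-full",
	.raw_code = {
		[STALLED_CYCLES_FIXED_POINT]       = 0x1001,
		[STALLED_CYCLES_LOAD_STORE]        = 0x1002,
		[STALLED_CYCLES_INSTRUCTION_FETCH] = 0x1003,
		[STALLED_CYCLES_BRANCH]            = 0x1004,
	},
};

/* Hypothetical architecture implementing one event selectively.
 * Entries left out are zero-initialized, i.e. unsupported. */
static const struct arch_stall_map example_partial_arch = {
	.arch_name = "example-partial",
	.raw_code = {
		[STALLED_CYCLES_LOAD_STORE] = 0x2002,
	},
};

/* Store the raw code for a generic stall event in *raw and return 0,
 * or return -1 if the ID is out of range or unsupported on this arch. */
static int stall_event_lookup(const struct arch_stall_map *map,
			      unsigned int id, uint64_t *raw)
{
	assert(map != NULL && raw != NULL);

	if (id >= STALLED_CYCLES_MAX ||
	    map->raw_code[id] == STALL_EVENT_UNSUPPORTED)
		return -1;

	*raw = map->raw_code[id];
	return 0;
}
```

With such a table, a request for STALLED_CYCLES_BRANCH succeeds on the first
architecture and fails cleanly on the second, so tools can use the symbolic
names everywhere and fall back only where an event is missing.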

Regards
Anshuman
