Message-ID: <039a2fc2-48e2-fe3b-73c1-f7f658c7f22f@hisilicon.com>
Date: Fri, 24 Mar 2023 10:34:33 +0800
From: Jie Zhan <zhanjie9@...ilicon.com>
To: Yicong Yang <yangyicong@...wei.com>, <acme@...nel.org>,
<mark.rutland@....com>, <peterz@...radead.org>, <mingo@...hat.com>,
<james.clark@....com>, <alexander.shishkin@...ux.intel.com>,
<linux-perf-users@...r.kernel.org>, <linux-kernel@...r.kernel.org>
CC: <Jonathan.Cameron@...wei.com>, <21cnbao@...il.com>,
<tim.c.chen@...el.com>, <prime.zeng@...ilicon.com>,
<shenyang39@...wei.com>, <linuxarm@...wei.com>,
<yangyicong@...ilicon.com>
Subject: Re: [PATCH] perf stat: Support per-cluster aggregation
On 13/03/2023 16:59, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@...ilicon.com>
>
> Some platforms have 'cluster' topology and CPUs in the cluster will
> share resources like L3 Cache Tag (for HiSilicon Kunpeng SoC) or L2
> cache (for Intel Jacobsville). Parsing and building cluster topology
> has been supported since [1].
>
> perf stat has already supported aggregation for other topologies like
> die or socket, etc. It'll be useful to aggregate per-cluster to find
> problems like L3T bandwidth contention or imbalance.
>
> This patch adds support for "--per-cluster" option for per-cluster
> aggregation. Also update the docs and related test. The output will
> be like:
>
> [root@...alhost tmp]# perf stat -a -e LLC-load --per-cluster -- sleep 5
>
> Performance counter stats for 'system wide':
>
> S56-D0-CLS158 4 1,321,521,570 LLC-load
> S56-D0-CLS594 4 794,211,453 LLC-load
> S56-D0-CLS1030 4 41,623 LLC-load
> S56-D0-CLS1466 4 41,646 LLC-load
> S56-D0-CLS1902 4 16,863 LLC-load
> S56-D0-CLS2338 4 15,721 LLC-load
> S56-D0-CLS2774 4 22,671 LLC-load
> [...]
>
> [1] commit c5e22feffdd7 ("topology: Represent clusters of CPUs within a die")
>
> Signed-off-by: Yicong Yang <yangyicong@...ilicon.com>
With this output, an end user may have to check sysfs to figure out which
CPUs each of those cluster IDs accounts for.
Is there a better way to show the mapping between CPUs and cluster IDs?
Perhaps conditionally adding the cluster ID (when there are clusters) to the
"--per-core" output would help.
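For reference, the sysfs lookup mentioned above can be scripted. Below is a minimal sketch, not part of the patch, assuming the kernel exposes /sys/devices/system/cpu/cpuN/topology/cluster_id (added by the commit cited as [1]). The helper name cluster_to_cpus is hypothetical, and it assumes cluster IDs are unique across the system; if they are not, the key would also need the package and die IDs.

```python
import glob
import os
import re
from collections import defaultdict


def cluster_to_cpus(sysfs_root="/sys/devices/system/cpu"):
    """Return {cluster_id: sorted list of CPU numbers} read from sysfs.

    Hypothetical helper for illustration only. Assumes cluster IDs are
    unique system-wide.
    """
    mapping = defaultdict(list)
    for path in glob.glob(os.path.join(sysfs_root, "cpu[0-9]*")):
        m = re.fullmatch(r"cpu(\d+)", os.path.basename(path))
        if not m:
            continue
        try:
            with open(os.path.join(path, "topology", "cluster_id")) as f:
                cluster_id = int(f.read().strip())
        except (OSError, ValueError):
            # Offline CPU, or a kernel without cluster topology support.
            continue
        mapping[cluster_id].append(int(m.group(1)))
    return {cid: sorted(cpus) for cid, cpus in sorted(mapping.items())}


if __name__ == "__main__":
    # Print one line per cluster, using the CLS prefix from the perf output.
    for cid, cpus in cluster_to_cpus().items():
        print(f"CLS{cid}: CPUs {cpus}")
```

Running it as a script prints one line per cluster ID, which can then be matched against the CLS fields in the "--per-cluster" output.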
Apart from that, this works well on my aarch64 machine.
Tested-by: Jie Zhan <zhanjie9@...ilicon.com>