Message-ID: <DM6PR11MB4107255E167D98A000DC49FBDC849@DM6PR11MB4107.namprd11.prod.outlook.com>
Date: Fri, 24 Mar 2023 18:05:50 +0000
From: "Chen, Tim C" <tim.c.chen@...el.com>
To: Yicong Yang <yangyicong@...wei.com>,
"acme@...nel.org" <acme@...nel.org>,
"mark.rutland@....com" <mark.rutland@....com>,
"peterz@...radead.org" <peterz@...radead.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"james.clark@....com" <james.clark@....com>,
"alexander.shishkin@...ux.intel.com"
<alexander.shishkin@...ux.intel.com>,
"linux-perf-users@...r.kernel.org" <linux-perf-users@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC: "Jonathan.Cameron@...wei.com" <Jonathan.Cameron@...wei.com>,
"21cnbao@...il.com" <21cnbao@...il.com>,
"prime.zeng@...ilicon.com" <prime.zeng@...ilicon.com>,
"shenyang39@...wei.com" <shenyang39@...wei.com>,
"linuxarm@...wei.com" <linuxarm@...wei.com>,
"yangyicong@...ilicon.com" <yangyicong@...ilicon.com>
Subject: RE: [PATCH] perf stat: Support per-cluster aggregation
>
>From: Yicong Yang <yangyicong@...ilicon.com>
>
>Some platforms have 'cluster' topology and CPUs in the cluster will share
>resources like L3 Cache Tag (for HiSilicon Kunpeng SoC) or L2 cache (for Intel
>Jacobsville). Currently parsing and building cluster topology have been
>supported since [1].
>
>perf stat has already supported aggregation for other topologies like die or
>socket, etc. It'll be useful to aggregate per-cluster to find problems like L3T
>bandwidth contention or imbalance.
>
>This patch adds support for "--per-cluster" option for per-cluster aggregation.
>Also update the docs and related test. The output will be like:
>
>[root@...alhost tmp]# perf stat -a -e LLC-load --per-cluster -- sleep 5
>
> Performance counter stats for 'system wide':
>
>S56-D0-CLS158     4    1,321,521,570      LLC-load
>S56-D0-CLS594     4      794,211,453      LLC-load
>S56-D0-CLS1030    4           41,623      LLC-load
>S56-D0-CLS1466    4           41,646      LLC-load
>S56-D0-CLS1902    4           16,863      LLC-load
>S56-D0-CLS2338    4           15,721      LLC-load
>S56-D0-CLS2774    4           22,671      LLC-load
>[...]

Overall it looks good. You can add my Reviewed-by.

I wonder if we could enhance the perf stat help message to tell the user to
refer to /sys/devices/system/cpu/cpuX/topology/*_id to map the relevant IDs
back to the overall CPU topology. In the example above, cluster S56-D0-CLS158
has a really heavy load, and it took me a while going through the code to
figure out how to find the info that maps a cluster ID to its CPUs.
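
For illustration only (not part of the patch), here is a minimal sketch of that
mapping. It assumes the per-CPU sysfs files physical_package_id, die_id and
cluster_id are present under topology/, and it builds labels in the same
S<socket>-D<die>-CLS<cluster> shape as the output above. The helper names
read_id and clusters_to_cpus are hypothetical, and treating a missing or
negative die_id as 0 is an assumption of this sketch.

```python
import re
from collections import defaultdict
from pathlib import Path

SYSFS_CPU = Path("/sys/devices/system/cpu")


def read_id(topology_dir, name):
    """Return the integer stored in topology/<name>, or None if unreadable."""
    try:
        return int((topology_dir / name).read_text().strip())
    except (OSError, ValueError):
        return None


def clusters_to_cpus(root=SYSFS_CPU):
    """Group CPU numbers by a socket/die/cluster label read from sysfs."""
    groups = defaultdict(list)
    for cpu_dir in Path(root).glob("cpu[0-9]*"):
        match = re.fullmatch(r"cpu(\d+)", cpu_dir.name)
        topo = cpu_dir / "topology"
        if not match or not topo.is_dir():
            continue  # skip entries that are not per-CPU directories
        socket = read_id(topo, "physical_package_id")
        die = read_id(topo, "die_id")
        cluster = read_id(topo, "cluster_id")
        if socket is None or cluster is None:
            continue  # kernel or platform does not expose these IDs
        if die is None or die < 0:
            die = 0  # assumption: fold a missing/unknown die into die 0
        groups[f"S{socket}-D{die}-CLS{cluster}"].append(int(match.group(1)))
    return {label: sorted(cpus) for label, cpus in groups.items()}


if __name__ == "__main__":
    for label, cpus in sorted(clusters_to_cpus().items()):
        print(label, cpus)
```

Run on the machine that produced the perf stat output, this prints one line per
cluster label followed by its CPU list, so a hot cluster can be matched to CPUs
directly.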
Tim