[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220602171120.31166-1-alisaidi@amazon.com>
Date: Thu, 2 Jun 2022 17:11:20 +0000
From: Ali Saidi <alisaidi@...zon.com>
To: <leo.yan@...aro.org>
CC: <acme@...nel.org>, <peterz@...radead.org>, <mingo@...hat.com>,
<mark.rutland@....com>, <alexander.shishkin@...ux.intel.com>,
<jolsa@...nel.org>, <namhyung@...nel.org>, <hi@...ssa.is>,
<irogers@...gle.com>, <likexu@...cent.com>, <kjain@...ux.ibm.com>,
<lihuafei1@...wei.com>, <adam.li@...erecomputing.com>,
<german.gomez@....com>, <james.clark@....com>,
<alisaidi@...zon.com>, <linux-perf-users@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 00/12] perf c2c: Support display for Arm64
Hi Leo,
On Wed, 1 Jun 2022 18:25:05 +0800 Leo Yan wrote:
> Hi Joe,
>
> On Tue, May 31, 2022 at 02:44:07PM -0400, Joe Mario wrote:
>
> [...]
>
> > Hi Leo:
> > I built a new perf with your patches and ran it on a 2-numa node Neoverse platform.
> > I then ran my simple test that creates reader and writer threads to tug on the same cacheline.
> > The c2c output is appended below.
> >
> > The output looks good, especially where you've broken out the (average) cycles for local and remote peer loads.
> > And I'm glad to see you fixed the "Node" column. I use that a lot to help detect remote node accesses.
>
> Thanks a lot for your testing and suggestions, which are really helpful!
>
> > And the "PA cnt" field is working as well, which is important to see if numa_balance is moving the data around.
>
> Good to know this. To be honest, before I didn't note for "PA cnt"
> metric, I checked a bit for the code, this metrics is very useful to
> understand how it's severe that a cache line is accessed from different
> addresses, so we can get sense how a cache line is hammered.
>
> [...]
>
> > Thanks for doing this. It looks good.
>
> You are welcome! And very appreicate your helping to mature the code.
Seconding that, thanks for progressing this so much Leo.
>
> > I'll assume someone else is reviewing your code changes.
>
> Yeah, let's give a bit more time for reviewing.
I've tested and given each patch a close look. I haven't found anything that
looks to change other architectures and the output on my Graviton systems looks
great. I pulled in your patch to add physical addresses to the spe records and
as expected I saw the Node properly populated and PA cnt is no longer zero. One
nit is the documentation still says that "Total HITMs (tot) as default," while
the code now defaults to "peer" on arm64. Other than that:
Tested-by: Ali Saidi <alisaidi@...zon.com>
Reviewed-by: Ali Saidi <alisaidi@...zon.com>
Thanks,
Ali
Powered by blists - more mailing lists