Message-ID: <e49505ea-5af2-41d3-23dc-8c01e20f91ee@linux.intel.com>
Date: Wed, 1 Jun 2022 10:04:59 -0400
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Ravi Bangoria <ravi.bangoria@....com>, acme@...nel.org
Cc: jolsa@...nel.org, irogers@...gle.com, peterz@...radead.org,
rrichter@....com, mingo@...hat.com, mark.rutland@....com,
namhyung@...nel.org, tglx@...utronix.de, bp@...en8.de,
james.clark@....com, leo.yan@...aro.org, ak@...ux.intel.com,
eranian@...gle.com, like.xu.linux@...il.com, x86@...nel.org,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
sandipan.das@....com, ananth.narayan@....com, kim.phillips@....com,
santosh.shukla@....com
Subject: Re: [PATCH v5 0/8] perf/amd: Zen4 IBS extensions support (tool
changes)
On 5/31/2022 11:26 PM, Ravi Bangoria wrote:
> The kernel-side changes have already been applied to linus/master
> (except the amd-ibs.h header). This series contains the perf tool
> changes.
>
> Kan, I don't have any machine with heterogeneous CPUs. It would be
> helpful if you could check HEADER_PMU_CAPS on an Intel ADL machine.
>
I tried patches 2-5 on a hybrid machine. I didn't see any regression
with the perf report --header-only option.
Without patches 2-5:

  # perf report --header-only | grep capabilities
  # cpu_core pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid
  # cpu_atom pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid

With patches 2-5:

  # ./perf report --header-only | grep capabilities
  # cpu_core pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid
  # cpu_atom pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid

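For anyone who wants to repeat this comparison mechanically, here is a minimal sketch. It assumes each "pmu capabilities" header line fits on one line and uses the `name=value, ...` layout shown above. The helper name `parse_pmu_caps` and the regular expression are illustrative only and are not part of perf.

```python
import re

# Matches lines such as:
#   "# cpu_core pmu capabilities: branches=32, max_precise=3, pmu_name=..."
# The pattern is an assumption based on the output quoted in this mail.
CAPS_LINE = re.compile(r"^#\s*(\S+) pmu capabilities:\s*(.*)$")


def parse_pmu_caps(header_text):
    """Return {pmu_name: {capability: value}} for every capabilities line."""
    caps = {}
    for line in header_text.splitlines():
        m = CAPS_LINE.match(line.strip())
        if not m:
            continue  # skips the shell command line and unrelated headers
        pmu, fields = m.groups()
        caps[pmu] = dict(
            f.strip().split("=", 1) for f in fields.split(",") if "=" in f
        )
    return caps


# Sample text copied from the two runs above (identical in both cases).
BEFORE = """\
# cpu_core pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid
# cpu_atom pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid
"""
AFTER = BEFORE

before = parse_pmu_caps(BEFORE)
after = parse_pmu_caps(AFTER)
print("no regression" if before == after else "capabilities differ")
```

Comparing parsed dictionaries rather than raw text keeps the check independent of field order and line wrapping.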
Thanks,
Kan
> v4: https://lore.kernel.org/lkml/20220523033945.1612-1-ravi.bangoria@amd.com
> v4->v5:
> - Replace HEADER_HYBRID_CPU_PMU_CAPS with HEADER_PMU_CAPS instead of
> adding a new header HEADER_PMU_CAPS. Special care is taken by writing
> hybrid cpu pmu caps first in the header to make sure the old perf tool
> does not break.
> - Store HEADER_CPU_PMU_CAPS capabilities in an array instead of a
> single string separated by NULL.
> - Include "cpu" pmu while searching for capabilities in perf_env.
> - Rebase on acme/perf/core (9dde6cadb92b5)
>
> Original cover letter:
>
> IBS support has been enhanced with two new features in the upcoming uarch:
> 1. DataSrc extension and 2. L3 Miss Filtering capability. Both are
> indicated by CPUID_Fn8000001B_EAX bit 11.
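
A minimal sketch of the feature check described above, assuming the EAX value of CPUID leaf 0x8000001B has already been obtained by some other means. The constant and function names are hypothetical, and this is not how the kernel or perf performs the check.

```python
# Bit position taken from the cover letter: CPUID_Fn8000001B_EAX bit 11
# indicates both the DataSrc extension and L3 Miss Filtering.
IBS_ZEN4_EXT_BIT = 11  # hypothetical constant name


def has_zen4_ibs_extensions(eax):
    """Return True if bit 11 of the given EAX value is set."""
    return bool((eax >> IBS_ZEN4_EXT_BIT) & 1)


# Illustrative values only, not read from real hardware.
print(has_zen4_ibs_extensions(0x800))  # only bit 11 set
print(has_zen4_ibs_extensions(0x7FF))  # bits 0-10 set, bit 11 clear
```
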
>
> DataSrc extension provides additional data source details for tagged
> load/store operations. Add support for these new bits in perf report/
> script raw-dump.
>
> IBS L3 miss filtering works by tagging an instruction on IBS counter
> overflow and generating an NMI if the tagged instruction causes an L3
> miss. Samples without an L3 miss are discarded and the counter is reset
> with a random value (between 1-15 for the fetch pmu and 1-127 for the
> op pmu). This helps reduce sampling overhead when the user is
> interested only in such samples. One use case for such filtered
> samples is to feed data to a page-migration daemon in tiered memory
> systems.
>
> Add support for L3 miss filtering in the IBS driver via a new pmu attribute
> "l3missonly". Example usage:
>
> # perf record -a -e ibs_op/l3missonly=1/ --raw-samples sleep 5
> # perf report -D
>
> Some important points to keep in mind while using L3 miss filtering:
> 1. Hw internally resets the sampling period when the tagged
> instruction does not cause an L3 miss. But there is no way to
> reconstruct the aggregated sampling period when this happens.
> 2. L3 miss is not the actual event being counted. Rather, IBS will
> count fetch, cycles or uOps depending on the configuration. Thus
> the sampling period has no direct connection to L3 misses.
>
> The first point causes sampling period skew. Thus, I've added a
> warning message to perf record:
>
> # perf record -c 10000 -C 0 -e ibs_op/l3missonly=1/
> WARNING: Hw internally resets sampling period when L3 Miss Filtering is enabled
> and tagged operation does not cause L3 Miss. This causes sampling period skew.
>
> Users can configure a smaller sampling period to get more samples
> while using l3missonly.
>
>
> Ravi Bangoria (8):
> perf record ibs: Warn about sampling period skew
> perf tool: Parse pmu caps sysfs only once
> perf headers: Pass "cpu" pmu name while printing caps
> perf headers: Store pmu caps in an array of strings
> perf headers: Record non-cpu pmu capabilities
> perf/x86/ibs: Add new IBS register bits into header
> perf tool ibs: Sync amd ibs header file
> perf script ibs: Support new IBS bits in raw trace dump
>
> arch/x86/include/asm/amd-ibs.h | 16 +-
> tools/arch/x86/include/asm/amd-ibs.h | 16 +-
> .../Documentation/perf.data-file-format.txt | 10 +-
> tools/perf/arch/x86/util/evsel.c | 49 +++++
> tools/perf/builtin-inject.c | 2 +-
> tools/perf/util/amd-sample-raw.c | 68 +++++-
> tools/perf/util/env.c | 62 +++++-
> tools/perf/util/env.h | 14 +-
> tools/perf/util/evsel.c | 7 +
> tools/perf/util/evsel.h | 1 +
> tools/perf/util/header.c | 196 ++++++++++--------
> tools/perf/util/header.h | 2 +-
> tools/perf/util/pmu.c | 15 +-
> tools/perf/util/pmu.h | 2 +
> 14 files changed, 333 insertions(+), 127 deletions(-)
>