[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ef228b62-c39f-5155-f012-6ed81508e99a@arm.com>
Date: Wed, 13 Dec 2023 13:48:22 +0000
From: James Clark <james.clark@....com>
To: Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Suzuki K Poulose <suzuki.poulose@....com>,
Mike Leach <mike.leach@...aro.org>,
Leo Yan <leo.yan@...aro.org>,
John Garry <john.g.garry@...cle.com>,
Will Deacon <will@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Darren Hart <dvhart@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>,
André Almeida <andrealmeid@...lia.com>,
Kan Liang <kan.liang@...ux.intel.com>,
K Prateek Nayak <kprateek.nayak@....com>,
Sean Christopherson <seanjc@...gle.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Kajol Jain <kjain@...ux.ibm.com>,
Athira Rajeev <atrajeev@...ux.vnet.ibm.com>,
Andrew Jones <ajones@...tanamicro.com>,
Alexandre Ghiti <alexghiti@...osinc.com>,
Atish Patra <atishp@...osinc.com>,
"Steinar H. Gunderson" <sesse@...gle.com>,
Yang Jihong <yangjihong1@...wei.com>,
Yang Li <yang.lee@...ux.alibaba.com>,
Changbin Du <changbin.du@...wei.com>,
Sandipan Das <sandipan.das@....com>,
Ravi Bangoria <ravi.bangoria@....com>,
Paran Lee <p4ranlee@...il.com>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Huacai Chen <chenhuacai@...nel.org>,
Yanteng Si <siyanteng@...ngson.cn>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
coresight@...ts.linaro.org, linux-arm-kernel@...ts.infradead.org,
bpf@...r.kernel.org
Subject: Re: [PATCH v1 06/14] libperf cpumap: Add any, empty and min helpers
On 12/12/2023 20:27, Ian Rogers wrote:
> On Tue, Dec 12, 2023 at 7:06 AM James Clark <james.clark@....com> wrote:
>>
>>
>>
>> On 29/11/2023 06:02, Ian Rogers wrote:
>>> Additional helpers to better replace
>>> perf_cpu_map__has_any_cpu_or_is_empty.
>>>
>>> Signed-off-by: Ian Rogers <irogers@...gle.com>
>>> ---
>>> tools/lib/perf/cpumap.c | 27 +++++++++++++++++++++++++++
>>> tools/lib/perf/include/perf/cpumap.h | 16 ++++++++++++++++
>>> tools/lib/perf/libperf.map | 4 ++++
>>> 3 files changed, 47 insertions(+)
>>>
>>> diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
>>> index 49fc98e16514..7403819da8fd 100644
>>> --- a/tools/lib/perf/cpumap.c
>>> +++ b/tools/lib/perf/cpumap.c
>>> @@ -316,6 +316,19 @@ bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map)
>>> return map ? __perf_cpu_map__cpu(map, 0).cpu == -1 : true;
>>> }
>>>
>>> +bool perf_cpu_map__is_any_cpu_or_is_empty(const struct perf_cpu_map *map)
>>> +{
>>> + if (!map)
>>> + return true;
>>> +
>>> + return __perf_cpu_map__nr(map) == 1 && __perf_cpu_map__cpu(map, 0).cpu == -1;
>>> +}
>>
>> I'm struggling to understand the relevance of the difference between
>> has_any and is_any I see that there is a slight difference, but could it
>> not be refactored out so we only need one?
>
> Yep, that's what these changes are working toward. For has any the set
> {-1, 0, 1} would return true while is any will return false.
> Previously the has any behavior was called "empty" which I think is
> actively misleading.
>
>> Do you ever get an 'any' map that has more than 1 entry? It's quite a
>> subtle difference that is_any returns false if the first one is 'any'
>> but then there are subsequent entries. Whereas has_any would return
>> true. I'm not sure if future readers would be able to appreciate that.
>>
>> I see has_any is only used twice, both on evlist->all_cpus. Is there
>> something about that member that means it could have a map that has an
>> 'any' mixed with CPUs? Wouldn't that have the same result as a normal
>> 'any' anyway?
>
> The dummy event may be opened on any CPU but then a particular event
> may be opened on certain CPUs. We merge CPU maps in places like evlist
> so that we can iterate the appropriate CPUs for events and
> open/enable/disable/close all events on a certain CPU at the same time
> (we also set the affinity to that CPU to avoid IPIs). What I'm hoping
> to do in these changes is reduce the ambiguity, the corner cases are
> by their nature unusual.
>
> An example of a corner case is, uncore events often get opened just on
> CPU 0 but on a multi-socket system you may have a CPU 32 that also
> needs to open the event. Previous code treated the CPU map index and
> value it contained pretty interchangeably. This is often fine for the
> core PMU but is clearly wrong in this uncore case, {0, 32} has indexes
> 0 and 1 but those indexes don't match the CPU numbers. The case of -1
> has often previously been called dummy but I'm trying to call it the
> "any CPU" case to match the perf_event_open man page (I'm hoping it
> also makes it less ambiguous with any CPU being used with a particular
> event like cycles, calling it dummy makes the event sound like it may
> have sideband data). The difference between "all CPUs" and "any CPU"
> is that an evsel for all CPUs would need the event opening
> individually on each CPU, while any CPU events are a single open call.
> Any CPU is only valid to perf_event_open if a PID is specified.
> Depending on the set up there could be overlaps in what they count but
> hopefully it is clearer what the distinction is. I believe the case of
> "any CPU" and specific CPU numbers is more common with aux buffers and
> Adrian has mentioned needing it for intel-pt.
>
> Thanks,
> Ian
>
Thanks for explaining. I suppose I didn't realise that 'any' could be
merged with per-cpu maps, but it makes sense.
>>> +
>>> +bool perf_cpu_map__is_empty(const struct perf_cpu_map *map)
>>> +{
>>> + return map == NULL;
>>> +}
>>> +
>>> int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
>>> {
>>> int low, high;
>>> @@ -372,6 +385,20 @@ bool perf_cpu_map__has_any_cpu(const struct perf_cpu_map *map)
>>> return map && __perf_cpu_map__cpu(map, 0).cpu == -1;
>>> }
>>>
>>> +struct perf_cpu perf_cpu_map__min(const struct perf_cpu_map *map)
>>> +{
>>> + struct perf_cpu cpu, result = {
>>> + .cpu = -1
>>> + };
>>> + int idx;
>>> +
>>> + perf_cpu_map__for_each_cpu_skip_any(cpu, idx, map) {
>>> + result = cpu;
>>> + break;
>>> + }
>>> + return result;
>>> +}
>>> +
>>> struct perf_cpu perf_cpu_map__max(const struct perf_cpu_map *map)
>>> {
>>> struct perf_cpu result = {
>>> diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h
>>> index dbe0a7352b64..523e4348fc96 100644
>>> --- a/tools/lib/perf/include/perf/cpumap.h
>>> +++ b/tools/lib/perf/include/perf/cpumap.h
>>> @@ -50,6 +50,22 @@ LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus);
>>> * perf_cpu_map__has_any_cpu_or_is_empty - is map either empty or has the "any CPU"/dummy value.
>>> */
>>> LIBPERF_API bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map);
>>> +/**
>>> + * perf_cpu_map__is_any_cpu_or_is_empty - is map either empty or the "any CPU"/dummy value.
>>> + */
>>> +LIBPERF_API bool perf_cpu_map__is_any_cpu_or_is_empty(const struct perf_cpu_map *map);
>>> +/**
>>> + * perf_cpu_map__is_empty - does the map contain no values and it doesn't
>>> + * contain the special "any CPU"/dummy value.
>>> + */
>>> +LIBPERF_API bool perf_cpu_map__is_empty(const struct perf_cpu_map *map);
>>> +/**
>>> + * perf_cpu_map__min - the minimum CPU value or -1 if empty or just the "any CPU"/dummy value.
>>> + */
>>> +LIBPERF_API struct perf_cpu perf_cpu_map__min(const struct perf_cpu_map *map);
>>> +/**
>>> + * perf_cpu_map__max - the maximum CPU value or -1 if empty or just the "any CPU"/dummy value.
>>> + */
>>> LIBPERF_API struct perf_cpu perf_cpu_map__max(const struct perf_cpu_map *map);
>>> LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, struct perf_cpu cpu);
>>> LIBPERF_API bool perf_cpu_map__equal(const struct perf_cpu_map *lhs,
>>> diff --git a/tools/lib/perf/libperf.map b/tools/lib/perf/libperf.map
>>> index 10b3f3722642..2aa79b696032 100644
>>> --- a/tools/lib/perf/libperf.map
>>> +++ b/tools/lib/perf/libperf.map
>>> @@ -10,6 +10,10 @@ LIBPERF_0.0.1 {
>>> perf_cpu_map__nr;
>>> perf_cpu_map__cpu;
>>> perf_cpu_map__has_any_cpu_or_is_empty;
>>> + perf_cpu_map__is_any_cpu_or_is_empty;
>>> + perf_cpu_map__is_empty;
>>> + perf_cpu_map__has_any_cpu;
>>> + perf_cpu_map__min;
>>> perf_cpu_map__max;
>>> perf_cpu_map__has;
>>> perf_thread_map__new_array;
Powered by blists - more mailing lists