Message-ID: <d9a4bebe-dde1-438f-bcf7-70b7a5e21848@linaro.org>
Date: Wed, 16 Oct 2024 09:29:16 +0100
From: James Clark <james.clark@...aro.org>
To: Ian Rogers <irogers@...gle.com>
Cc: linux-perf-users@...r.kernel.org, acme@...nel.org, namhyung@...nel.org,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
"Liang, Kan" <kan.liang@...ux.intel.com>, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 1/1] libperf: evlist: Fix --cpu argument on hybrid platform

On 15/10/2024 4:14 pm, Ian Rogers wrote:
> On Tue, Oct 15, 2024 at 7:54 AM James Clark <james.clark@...aro.org> wrote:
>>
>> Since the linked fixes: commit, specifying a CPU on hybrid platforms
>> results in an error because Perf tries to open an extended type event
>> on "any" CPU which isn't valid. Extended type events can only be opened
>> on CPUs that match the type.
>>
>> Before (working):
>>
>> $ perf record --cpu 1 -- true
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
>>
>> After (not working):
>>
>> $ perf record -C 1 -- true
>> WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
>> Error:
>> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
>> /bin/dmesg | grep -i perf may provide additional information.
>>
>> (Ignore the warning message, that's expected and not particularly
>> relevant to this issue).
>>
>> This is because perf_cpu_map__intersect() of the user specified CPU (1)
>> and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
>> CPU map. However for the purposes of opening an event, libperf converts
>> empty CPU maps into an any CPU (-1) which the kernel rejects.
>
> Ugh. The cpumap API tries its best to confuse NULL == empty but empty
> can give you dummy, dummy is also known as 'any' or -1, 'any' sounds a
> lot like 'all' but sometimes 'all' is only online CPUs. I tried to
> tidy up the naming a while ago, but there is still a mess.
>

Do you think this is a good opportunity for me to have a go at finishing
that cleanup and separating those concepts? Or is it a dead end?

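
To make the empty-vs-any confusion concrete, here is a minimal standalone
sketch. It is not libperf code: the toy_* names and the bitmask
representation are made up for illustration, and only the "empty map gets
treated as any CPU (-1)" behaviour is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model: bit N set in the mask means CPU N is in the map. */
typedef uint64_t toy_cpu_mask;

#define TOY_ANY_CPU (-1)	/* stands in for the "any CPU" value */

/* Build a mask covering CPUs first..last inclusive. */
static toy_cpu_mask toy_range(int first, int last)
{
	toy_cpu_mask m = 0;

	assert(first >= 0 && last < 64 && first <= last);
	for (int cpu = first; cpu <= last; cpu++)
		m |= (toy_cpu_mask)1 << cpu;
	return m;
}

/* Intersection of the user requested CPUs and the CPUs a PMU covers. */
static toy_cpu_mask toy_intersect(toy_cpu_mask user, toy_cpu_mask pmu)
{
	return user & pmu;
}

/*
 * Models the problematic conversion: an empty map is not reported as
 * "nothing to open" but silently becomes "any CPU".
 * Returns the lowest CPU in the mask, or TOY_ANY_CPU if the mask is empty.
 */
static int toy_first_cpu_to_open(toy_cpu_mask m)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (m & ((toy_cpu_mask)1 << cpu))
			return cpu;
	return TOY_ANY_CPU;
}
```

With the user asking for CPU 1 and a PMU covering CPUs 16-27, the
intersection is empty and toy_first_cpu_to_open() hands back -1, which is
the value that then gets rejected for an extended type event.
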
>> Fix it by deleting evsels with empty CPU maps in the specific case where
>> user requested CPU maps are evaluated.
>
> If we delete evsels then the indices can be broken for certain things.
> I'm guessing asan testing is clean but the large number of side data
> structures that are indexed by things in another data structure makes
> the whole code base brittle and I am nervous around this change.
>
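
As a rough illustration of the concern about indices, here is a standalone
toy sketch. It is an assumption about the general failure mode rather than
a reproduction of any specific perf data structure: the toy_* types are
hypothetical, and the "side data" is just an array indexed by an entry's
position in the list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8

/* Hypothetical stand-ins for an evsel and an evlist. */
struct toy_evsel {
	const char *name;
};

struct toy_evlist {
	struct toy_evsel entries[TOY_MAX];
	size_t nr;
};

/* Remove the entry at pos, shifting later entries down by one. */
static void toy_remove(struct toy_evlist *l, size_t pos)
{
	assert(pos < l->nr);
	memmove(&l->entries[pos], &l->entries[pos + 1],
		(l->nr - pos - 1) * sizeof(l->entries[0]));
	l->nr--;
}

/*
 * Side data kept outside the list and indexed by list position.
 * Nothing updates it when toy_remove() shifts the entries.
 */
static int toy_side_lookup(const int *side, size_t pos)
{
	return side[pos];
}
```

If a list holds a cpu_atom event at position 0 and a cpu_core event at
position 1, removing position 0 moves the cpu_core event to position 0
while the side array still holds the cpu_atom value there. Any lookup by
position after the removal silently reads data for the wrong event.
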
>> Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
>> Signed-off-by: James Clark <james.clark@...aro.org>
>
> Reviewed-by: Ian Rogers <irogers@...gle.com>
>
> Thanks,
> Ian
>

OK, if we're not completely opposed to doing it this way, I will dig a bit
more and double-check that everything is working.

>> ---
>> tools/lib/perf/evlist.c | 11 +++++++++--
>> 1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
>> index c6d67fc9e57e..8fae9a157a91 100644
>> --- a/tools/lib/perf/evlist.c
>> +++ b/tools/lib/perf/evlist.c
>> @@ -47,6 +47,13 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>>  		 */
>>  		perf_cpu_map__put(evsel->cpus);
>>  		evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
>> +
>> +		/*
>> +		 * Empty cpu lists would eventually get opened as "any" so remove
>> +		 * genuinely empty ones before they're opened in the wrong place.
>> +		 */
>> +		if (perf_cpu_map__is_empty(evsel->cpus))
>> +			perf_evlist__remove(evlist, evsel);
>>  	} else if (!evsel->own_cpus || evlist->has_user_cpus ||
>>  		   (!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
>>  		/*
>> @@ -80,11 +87,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>>
>>  static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
>>  {
>> -	struct perf_evsel *evsel;
>> +	struct perf_evsel *evsel, *n;
>>
>>  	evlist->needs_map_propagation = true;
>>
>> -	perf_evlist__for_each_evsel(evlist, evsel)
>> +	list_for_each_entry_safe(evsel, n, &evlist->entries, node)
>>  		__perf_evlist__propagate_maps(evlist, evsel);
>>  }
>>
>> --
>> 2.34.1
>>
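
For anyone wondering why the second hunk switches iterators: removing an
entry while walking a list is only safe if the successor is saved before
the current entry is unlinked, which is the idea behind
list_for_each_entry_safe(). Below is a minimal standalone sketch of that
pattern. It uses a hypothetical singly linked toy_node rather than the
kernel's list.h, and nr_cpus == 0 stands in for an empty CPU map.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node; nr_cpus == 0 models an empty CPU map. */
struct toy_node {
	int nr_cpus;
	struct toy_node *next;
};

/*
 * Unlink every node whose nr_cpus is 0 and return how many were removed.
 * The successor is read into 'n' before 'cur' is unlinked, so the loop
 * never follows a pointer out of a node that has left the list.
 */
static int toy_prune_empty(struct toy_node **head)
{
	struct toy_node **link = head;	/* pointer that points at cur */
	struct toy_node *cur, *n;
	int removed = 0;

	for (cur = *head; cur; cur = n) {
		n = cur->next;			/* save successor first */
		if (cur->nr_cpus == 0) {
			*link = n;		/* unlink cur */
			cur->next = NULL;	/* cur is no longer on the list */
			removed++;
		} else {
			link = &cur->next;
		}
	}
	return removed;
}

/* Count the nodes currently on the list. */
static int toy_count(const struct toy_node *head)
{
	int count = 0;

	for (; head; head = head->next)
		count++;
	return count;
}
```

A plain "cur = cur->next" step would read the next pointer from a node that
has already been unlinked and cleared, which ends the walk early here and
can be a use-after-free when removal also frees the node.
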