[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e9383607-1e43-4c1a-9512-29f27784d035@linux.intel.com>
Date: Wed, 13 Dec 2023 16:11:00 -0500
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Ian Rogers <irogers@...gle.com>
Cc: Namhyung Kim <namhyung@...nel.org>, acme@...nel.org,
mark.rutland@....com, maz@...nel.org, marcan@...can.st,
linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org
Subject: Re: [PATCH V2] perf top: Use evsel's cpus to replace
user_requested_cpus
On 2023-12-13 12:42 p.m., Ian Rogers wrote:
> On Wed, Dec 13, 2023 at 7:45 AM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>>
>>
>> On 2023-12-12 8:06 p.m., Namhyung Kim wrote:
>>> On Tue, Dec 12, 2023 at 2:12 PM Ian Rogers <irogers@...gle.com> wrote:
>>>> On Tue, Dec 12, 2023 at 1:25 PM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>>>>>
>>>>>
>>>>> On 2023-12-12 3:37 p.m., Ian Rogers wrote:
>>>>>> On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@...ux.intel.com> wrote:
>>>>>>> From: Kan Liang <kan.liang@...ux.intel.com>
>>>>>>>
>>>>>>> perf top errors out on a hybrid machine
>>>>>>> $perf top
>>>>>>>
>>>>>>> Error:
>>>>>>> The cycles:P event is not supported.
>>>>>>>
>>>>>>> The perf top expects that the "cycles" is collected on all CPUs in the
>>>>>>> system. But for hybrid there is no single "cycles" event which can cover
>>>>>>> all CPUs. Perf has to split it into two cycles events, e.g.,
>>>>>>> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask.
>>>>>>> If a event is opened on the unsupported CPU. The open fails. That's the
>>>>>>> reason of the above error out.
>>>>>>>
>>>>>>> Perf should only open the cycles event on the corresponding CPU. The
>>>>>>> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting
>>>>>>> core PMU maps") intersect the requested CPU map with the CPU map of the
>>>>>>> PMU. Use the evsel's cpus to replace user_requested_cpus.
>>>>>>>
>>>>>>> The evlist's threads are also propagated to the evsel's threads in
>>>>>>> __perf_evlist__propagate_maps(). For a system-wide event, perf appends
>>>>>>> a dummy event and assign it to the evsel's threads. For a per-thread
>>>>>>> event, the evlist's thread_map is assigned to the evsel's threads. The
>>>>>>> same as the other tools, e.g., perf record, using the evsel's threads
>>>>>>> when opening an event.
>>>>>>>
>>>>>>> Reported-by: Arnaldo Carvalho de Melo <acme@...nel.org>
>>>>>>> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/
>>>>>>> Reviewed-by: Ian Rogers <irogers@...gle.com>
>>>>>>> Signed-off-by: Kan Liang <kan.liang@...ux.intel.com>
>>>>>>> ---
>>>>>>>
>>>>>>> Changes since V1:
>>>>>>> - Update the description
>>>>>>> - Add Reviewed-by from Ian
>>>>>> Thanks Kan, quick question. Does "perf top" on hybrid ask the user to
>>>>>> select between the cycles event on cpu_atom and cpu_core?
>>>>> Yes, but the event doesn't include the PMU information.
>>>>> We probably need a follow up patch to append the PMU name.
>>>>>
>>>>> Available samples
>>>>> 385 cycles:P
>>>>>
>>>>> 903 cycles:P
>>>> Thanks and agreed, it isn't possible to tell which is which PMU/CPU
>>>> type at the moment. I tried the patch with perf top --stdio, there
>>>> wasn't a choice of event
>> The perf top --stdio uses a dedicated display function, see
>> perf_top__header_snprintf() in util/top.c
>>
>> It only shows one event at a time. "E" is used to switch the event.
>>
>>>> and I can't tell what counter is being
>>>> displayed.
>> For the hybrid case, I think we may display both PMU name and event
>> name. For example,
>>
>> Available samples
>> 656 cycles:P cpu_atom
>>
>> 701 cycles:P cpu_core
> I think there would be more uniformity if we could do:
> cpu/cycles/P
> I'm just reminded of the stat output where sometimes you can get things like:
> cpu/cycles/
> or sometimes:
> cycles [cpu]
> It seems the slash style approach is the most uniform given it looks
> like what is written on the command line. Perhaps we need some kind of
> helper function to do this as reformatting the modifier is a pain.
Actually, we already have a helper in the perf record,
record__uniquify_name().
I can move it to the util/record.c and let's perf top invoke it before
the open as well. Then we can get this in the perf top.
Available samples
148 cpu_atom/cycles:P/
1K cpu_core/cycles:P/
It should be good enough to distinguish the events.
I will send a patch to support it.
Thanks,
Kan
Powered by blists - more mailing lists