Message-ID: <25b32870-12e1-b237-648a-3c6fd9678bb9@intel.com>
Date: Mon, 31 Jul 2023 16:01:54 +0300
From: Adrian Hunter <adrian.hunter@...el.com>
To: Yang Jihong <yangjihong1@...wei.com>, peterz@...radead.org,
mingo@...hat.com, acme@...nel.org, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
namhyung@...nel.org, irogers@...gle.com, kan.liang@...ux.intel.com,
james.clark@....com, tmricht@...ux.ibm.com, ak@...ux.intel.com,
anshuman.khandual@....com, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org
Subject: Re: [PATCH v3 4/7] perf record: Track sideband events for all CPUs
when tracing selected CPUs
On 31/07/23 15:38, Yang Jihong wrote:
> Hello,
>
> On 2023/7/31 19:08, Adrian Hunter wrote:
>> On 22/07/23 12:32, Yang Jihong wrote:
>>> User space tasks can migrate between CPUs, so we need to track sideband
>>> events for all CPUs.
>>>
>>> The specific scenarios are as follows:
>>>
>>>          CPU0                                 CPU1
>>>   perf record -C 0 start
>>>                               taskA starts to be created and executed
>>>                                 -> PERF_RECORD_COMM and PERF_RECORD_MMAP
>>>                                    events only deliver to CPU1
>>>                               ......
>>>                                 |
>>>                           migrate to CPU0
>>>                                 |
>>>   Running on CPU0    <----------/
>>>   ...
>>>
>>>   perf record -C 0 stop
>>>
>>> Now perf samples the PC of taskA. However, perf does not record the
>>> PERF_RECORD_COMM and PERF_RECORD_MMAP events of taskA.
>>> Therefore, the comm and symbols of taskA cannot be parsed.
>>>
>>> The solution is to record sideband events for all CPUs when tracing
>>> selected CPUs. Because this modifies the default behavior, add related
>>> comments to the perf record man page.
>>>
>>> The sys_perf_event_open invoked is as follows:
>>>
>>> # perf --debug verbose=3 record -e cpu-clock -C 1 true
>>> <SNIP>
>>> Opening: cpu-clock
>>> ------------------------------------------------------------
>>> perf_event_attr:
>>> type 1 (PERF_TYPE_SOFTWARE)
>>> size 136
>>> config 0 (PERF_COUNT_SW_CPU_CLOCK)
>>> { sample_period, sample_freq } 4000
>>> sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER
>>> read_format ID|LOST
>>> disabled 1
>>> inherit 1
>>> freq 1
>>> sample_id_all 1
>>> exclude_guest 1
>>> ------------------------------------------------------------
>>> sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 5
>>> Opening: dummy:u
>>> ------------------------------------------------------------
>>> perf_event_attr:
>>> type 1 (PERF_TYPE_SOFTWARE)
>>> size 136
>>> config 0x9 (PERF_COUNT_SW_DUMMY)
>>> { sample_period, sample_freq } 1
>>> sample_type IP|TID|TIME|CPU|IDENTIFIER
>>> read_format ID|LOST
>>> inherit 1
>>> exclude_kernel 1
>>> exclude_hv 1
>>> mmap 1
>>> comm 1
>>> task 1
>>> sample_id_all 1
>>> exclude_guest 1
>>> mmap2 1
>>> comm_exec 1
>>> ksymbol 1
>>> bpf_event 1
>>> ------------------------------------------------------------
>>> sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 6
>>> sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 7
>>> sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 9
>>> sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 10
>>> sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 11
>>> sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 12
>>> sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 13
>>> sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 14
>>> <SNIP>
>>>
>>> Signed-off-by: Yang Jihong <yangjihong1@...wei.com>
>>> ---
>>> tools/perf/Documentation/perf-record.txt | 3 +++
>>> tools/perf/builtin-record.c | 14 +++++++++++++-
>>> 2 files changed, 16 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
>>> index 680396c56bd1..dac53ece51ab 100644
>>> --- a/tools/perf/Documentation/perf-record.txt
>>> +++ b/tools/perf/Documentation/perf-record.txt
>>> @@ -388,6 +388,9 @@ comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-
>>> In per-thread mode with inheritance mode on (default), samples are captured only when
>>> the thread executes on the designated CPUs. Default is to monitor all CPUs.
>>> +User space tasks can migrate between CPUs, so when tracing selected CPUs,
>>> +a dummy event is created to track sideband for all CPUs.
>>> +
>>> -B::
>>> --no-buildid::
>>> Do not save the build ids of binaries in the perf.data files. This skips
>>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>>> index 3ff9d972225e..4e8e97928f05 100644
>>> --- a/tools/perf/builtin-record.c
>>> +++ b/tools/perf/builtin-record.c
>>> @@ -912,6 +912,7 @@ static int record__config_tracking_events(struct record *rec)
>>> {
>>> struct record_opts *opts = &rec->opts;
>>> struct evlist *evlist = rec->evlist;
>>> + bool system_wide = false;
>>> struct evsel *evsel;
>>> /*
>>> @@ -921,7 +922,18 @@ static int record__config_tracking_events(struct record *rec)
>>> */
>>> if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
>>> perf_pmus__num_core_pmus() > 1) {
>>> - evsel = evlist__findnew_tracking_event(evlist, false);
>>> +
>>> + /*
>>> + * User space tasks can migrate between CPUs, so when tracing
>>> + * selected CPUs, sideband for all CPUs is still needed.
>>> + *
>>> + * If all (non-dummy) evsel have exclude_user,
>>> + * system_wide is not needed.
>>> + */
>>> + if (!!opts->target.cpu_list && !opts->all_kernel)
>>
>> Not everyone uses all-kernel. Can we check the evsels are either dummy
>> or exclude_user?
> For perf_record, exclude_user of all evsels is set in evsel__config(), and record__config_tracking_events() is before evsel__config().
>
> Uh..., it seems that only opts->all_kernel can be used to check exclude_user of evsels.
>
> void evsel__config()
> {
>     ...
>     if (opts->all_kernel) {
>         attr->exclude_kernel = 0;
>         attr->exclude_user   = 1;
>     }
>     ...
> }

The event parser updates attr according to modifiers such as ":k". I guess
opts->all_kernel or opts->all_user override that as well.
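
A rough sketch of the kind of check suggested above, i.e. "every evsel is
either dummy or exclude_user". The struct and function names below are
hypothetical, simplified stand-ins, not the real struct evsel or any existing
perf helper. The sketch also assumes the exclude_user bits are already final
when the check runs, which is the open question in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for the per-event state discussed
 * above. This is NOT the real struct evsel or struct perf_event_attr.
 */
struct sketch_evsel {
	bool is_dummy;     /* tracking (dummy) event */
	bool exclude_user; /* e.g. set by a ":k" modifier or by all_kernel */
};

/*
 * Return true when at least one non-dummy event may sample user space,
 * in which case sideband events for all CPUs would be needed.
 * Return false when every event is either dummy or exclude_user.
 */
static bool sketch_needs_system_wide_sideband(const struct sketch_evsel *evsels,
					      size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (!evsels[i].is_dummy && !evsels[i].exclude_user)
			return true;
	}
	return false;
}
```

With something along these lines, the condition in the patch would depend on
the events themselves rather than only on opts->all_kernel.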