[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9ae47526-8d40-dcd7-4fef-199b8605bfe9@huawei.com>
Date: Mon, 8 May 2023 09:59:07 +0800
From: Yang Jihong <yangjihong1@...wei.com>
To: <peterz@...radead.org>, <linux-perf-users@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
CC: <mingo@...hat.com>, <acme@...nel.org>, <mark.rutland@....com>,
<alexander.shishkin@...ux.intel.com>, <jolsa@...nel.org>,
<namhyung@...nel.org>, <irogers@...gle.com>,
<adrian.hunter@...el.com>
Subject: Re: [PATCH v3] perf/core: Fix perf_sample_data not properly
initialized for different swevents in perf_tp_event()
Hello, Peter
PING.
As shown in Link [1] report, data->raw->frag.data may be accessed in
perf_tp_event_match().
we may need to init sample_data and then go through swevent hlist to
prevent reference of NULL pointer.
Please check whether the solution is OK.
[1] Link:
https://lore.kernel.org/oe-lkp/202304250929.efef2caa-yujie.liu@intel.com
Thanks,
Yang
On 2023/4/25 18:32, Yang Jihong wrote:
> data->sample_flags may be modified in perf_prepare_sample(),
> in perf_tp_event(), different swevents use the same on-stack
> perf_sample_data, the previous swevent may change sample_flags in
> perf_prepare_sample(), as a result, some members of perf_sample_data are
> not correctly initialized when next swevent_event preparing sample
> (for example data->id, the value varies according to swevent).
>
> A simple scenario triggers this problem is as follows:
>
> # perf record -e sched:sched_switch --switch-output-event sched:sched_switch -a sleep 1
> [ perf record: dump data: Woken up 0 times ]
> [ perf record: Dump perf.data.2023041209014396 ]
> [ perf record: dump data: Woken up 0 times ]
> [ perf record: Dump perf.data.2023041209014662 ]
> [ perf record: dump data: Woken up 0 times ]
> [ perf record: Dump perf.data.2023041209014910 ]
> [ perf record: Woken up 0 times to write data ]
> [ perf record: Dump perf.data.2023041209015164 ]
> [ perf record: Captured and wrote 0.069 MB perf.data.<timestamp> ]
> # ls -l
> total 860
> -rw------- 1 root root 95694 Apr 12 09:01 perf.data.2023041209014396
> -rw------- 1 root root 606430 Apr 12 09:01 perf.data.2023041209014662
> -rw------- 1 root root 82246 Apr 12 09:01 perf.data.2023041209014910
> -rw------- 1 root root 82342 Apr 12 09:01 perf.data.2023041209015164
> # perf script -i perf.data.2023041209014396
> 0x11d58 [0x80]: failed to process type: 9 [Bad address]
>
> Solution: Re-initialize perf_sample_data after each event is processed.
> Note that data->raw->frag.data may be accessed in perf_tp_event_match().
> Therefore, need to init sample_data and then go through swevent hlist to prevent
> reference of NULL pointer, reported by [1].
>
> After fix:
>
> # perf record -e sched:sched_switch --switch-output-event sched:sched_switch -a sleep 1
> [ perf record: dump data: Woken up 0 times ]
> [ perf record: Dump perf.data.2023041209442259 ]
> [ perf record: dump data: Woken up 0 times ]
> [ perf record: Dump perf.data.2023041209442514 ]
> [ perf record: dump data: Woken up 0 times ]
> [ perf record: Dump perf.data.2023041209442760 ]
> [ perf record: Woken up 0 times to write data ]
> [ perf record: Dump perf.data.2023041209443003 ]
> [ perf record: Captured and wrote 0.069 MB perf.data.<timestamp> ]
> # ls -l
> total 864
> -rw------- 1 root root 100166 Apr 12 09:44 perf.data.2023041209442259
> -rw------- 1 root root 606438 Apr 12 09:44 perf.data.2023041209442514
> -rw------- 1 root root 82246 Apr 12 09:44 perf.data.2023041209442760
> -rw------- 1 root root 82342 Apr 12 09:44 perf.data.2023041209443003
> # perf script -i perf.data.2023041209442259 | head -n 5
> perf 232 [000] 66.846217: sched:sched_switch: prev_comm=perf prev_pid=232 prev_prio=120 prev_state=D ==> next_comm=perf next_pid=234 next_prio=120
> perf 234 [000] 66.846449: sched:sched_switch: prev_comm=perf prev_pid=234 prev_prio=120 prev_state=S ==> next_comm=perf next_pid=232 next_prio=120
> perf 232 [000] 66.846546: sched:sched_switch: prev_comm=perf prev_pid=232 prev_prio=120 prev_state=R ==> next_comm=perf next_pid=234 next_prio=120
> perf 234 [000] 66.846606: sched:sched_switch: prev_comm=perf prev_pid=234 prev_prio=120 prev_state=S ==> next_comm=perf next_pid=232 next_prio=120
> perf 232 [000] 66.846646: sched:sched_switch: prev_comm=perf prev_pid=232 prev_prio=120 prev_state=R ==> next_comm=perf next_pid=234 next_prio=120
>
> [1] Link: https://lore.kernel.org/oe-lkp/202304250929.efef2caa-yujie.liu@intel.com
>
> Fixes: bb447c27a467 ("perf/core: Set data->sample_flags in perf_prepare_sample()")
> Signed-off-by: Yang Jihong <yangjihong1@...wei.com>
> ---
>
> Changes since v2:
> - Initialize perf_sample_data before go through the swevent hlist.
> - Re-initialize perf_sample_data after each swevent is processed.
>
> Changes since v1:
> - Re-initialize the entire perf_sample_data before processing each swevent.
>
> kernel/events/core.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 435815d3be3f..753d4e9665b6 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -10150,8 +10150,20 @@ void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
> perf_trace_buf_update(record, event_type);
>
> hlist_for_each_entry_rcu(event, head, hlist_entry) {
> - if (perf_tp_event_match(event, &data, regs))
> + if (perf_tp_event_match(event, &data, regs)) {
> perf_swevent_event(event, count, &data, regs);
> +
> + /*
> + * Here use the same on-stack perf_sample_data,
> + * some members in data are event-specific and
> + * need to be re-computed for different sweveents.
> + * Re-initialize data->sample_flags safely to avoid
> + * the problem that next event skips preparing data
> + * because data->sample_flags is set.
> + */
> + perf_sample_data_init(&data, 0, 0);
> + perf_sample_save_raw_data(&data, &raw);
> + }
> }
>
> /*
>
Powered by blists - more mailing lists