[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <3a7e3842-9383-bdcf-6d20-3b47005cbdae@huawei.com>
Date: Wed, 28 Jun 2023 10:55:08 +0800
From: Yang Jihong <yangjihong1@...wei.com>
To: Ian Rogers <irogers@...gle.com>
CC: <peterz@...radead.org>, <mingo@...hat.com>, <acme@...nel.org>,
<mark.rutland@....com>, <alexander.shishkin@...ux.intel.com>,
<jolsa@...nel.org>, <namhyung@...nel.org>,
<adrian.hunter@...el.com>, <kan.liang@...ux.intel.com>,
<linux-perf-users@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] perf top & record: Fix segfault when default cycles event
is not supported
Hello,
On 2023/6/28 2:32, Ian Rogers wrote:
> On Thu, Jun 15, 2023 at 4:46 AM Yang Jihong <yangjihong1@...wei.com> wrote:
>>
>> Hello,
>>
>> On 2023/6/15 10:04, Ian Rogers wrote:
>>> On Wed, Jun 14, 2023 at 6:55 PM Yang Jihong <yangjihong1@...wei.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> On 2023/6/15 6:03, Ian Rogers wrote:
>>>>> On Wed, Jun 14, 2023 at 9:18 AM Ian Rogers <irogers@...gle.com> wrote:
>>>>>>
>>>>>> On Wed, Jun 14, 2023 at 8:18 AM Yang Jihong <yangjihong1@...wei.com> wrote:
>>>>>>>
>>>>>>> The perf-record and perf-top call parse_event() to add a cycles event to
>>>>>>> an empty evlist. For the system that does not support hardware cycles
>>>>>>> event, such as QEMU, the evlist is empty due to the following code process:
>>>>>>>
>>>>>>> parse_event(evlist, "cycles:P" or ""cycles:Pu")
>>>>>>> parse_events(evlist, "cycles:P")
>>>>>>> __parse_events
>>>>>>> ...
>>>>>>> ret = parse_events__scanner(str, &parse_state);
>>>>>>> // ret = 0
>>>>>>> ...
>>>>>>> ret2 = parse_events__sort_events_and_fix_groups()
>>>>>>> if (ret2 < 0)
>>>>>>> return ret;
>>>>>>> // The cycles event is not supported, here ret2 = -EINVAL,
>>>>>>> // Here return 0.
>>>>>>> ...
>>>>>>> evlist__splice_list_tail(evlist)
>>>>>>> // The code here does not execute to, so the evlist is still empty.
>>>>>>>
>>>>>>> A null pointer occurs when the content in the evlist is accessed later.
>>>>>>>
>>>>>>> Before:
>>>>>>>
>>>>>>> # perf list hw
>>>>>>>
>>>>>>> List of pre-defined events (to be used in -e or -M):
>>>>>>>
>>>>>>> # perf record true
>>>>>>> libperf: Miscounted nr_mmaps 0 vs 1
>>>>>>> WARNING: No sample_id_all support, falling back to unordered processing
>>>>>>> perf: Segmentation fault
>>>>>>> Obtained 1 stack frames.
>>>>>>> [0xc5beff]
>>>>>>> Segmentation fault
>>>>>>>
>>>>>>> Solution:
>>>>>>> If cycles event is not supported, try to fall back to cpu-clock event.
>>>>>>>
>>>>>>> After:
>>>>>>> # perf record true
>>>>>>> [ perf record: Woken up 1 times to write data ]
>>>>>>> [ perf record: Captured and wrote 0.006 MB perf.data ]
>>>>>>> #
>>>>>>>
>>>>>>> Fixes: 7b100989b4f6 ("perf evlist: Remove __evlist__add_default")
>>>>>>> Signed-off-by: Yang Jihong <yangjihong1@...wei.com>
>>>>>>
>>>>>> Thanks, useful addition. The cpu-clock fall back wasn't present before
>>>>>> 7b100989b4f6 so is the fixes tag correct?
>>>>>
>>>>> Hmm... it should be coming from evsel__fallback:
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evsel.c?h=tmp.perf-tools-next#n2840
>>>>> so we shouldn't duplicate that logic. The question is why we're not
>>>>> doing the fallback.
>>>>>
>>>>
>>>> Yes, it's a bit of the same logic as evsel__fallback, or we can call
>>>> evlist__add_default() as before, simply create an evsel of hardware
>>>> cycles and add it directly to evlist.
>>>>
>>>> Please confirm whether this solution is feasible. If it is feasible, I
>>>> will send a v2 version.
>>>
>>> The previous evlist__add_default logic didn't handle wildcard PMUs for
>>> cycles, hence wanting to reuse the parse events logic. The error is
>>> that the logic now isn't doing the fallback properly. I think an
>>> evlist__add_cycles which uses evsel__fallback makes sense matching the
>>> previous logic. I'd be happy if you took a look. I'll write a patch so
>>> that the perf_pmus list of core PMUs is never empty.
>>>
>>
>> The gdb calltrace for core dump is as follows:
>>
>> (gdb) bt
>> #0 0x00000000005ffaa2 in __perf_cpu_map__nr (cpus=0x0) at cpumap.c:283
>> #1 0x00000000005ffd17 in perf_cpu_map__max (map=0x0) at cpumap.c:371
>> #2 0x0000000000565644 in cpu_map_data__alloc
>> (syn_data=syn_data@...ry=0x7ffc843bff30,
>> header_size=header_size@...ry=8) at util/synthetic-events.c:1273
>> #3 0x0000000000568712 in cpu_map_event__new (map=<optimized out>) at
>> util/synthetic-events.c:1321
>> #4 perf_event__synthesize_cpu_map (tool=tool@...ry=0xc37580 <record>,
>> map=<optimized out>, process=process@...ry=0x413a80
>> <process_synthesized_event>, machine=machine@...ry=0x0) at
>> util/synthetic-events.c:1341
>> #5 0x000000000041426e in record__synthesize (tail=tail@...ry=false,
>> rec=0xc37580 <record>) at builtin-record.c:2050
>> #6 0x0000000000415a0b in __cmd_record (argc=<optimized out>,
>> argv=<optimized out>, rec=0xc37580 <record>) at builtin-record.c:2512
>> #7 0x0000000000418f22 in cmd_record (argc=<optimized out>,
>> argv=<optimized out>) at builtin-record.c:4260
>> #8 0x00000000004babdd in run_builtin (p=p@...ry=0xc3a0e8
>> <commands+264>, argc=argc@...ry=2, argv=argv@...ry=0x7ffc843c5b30) at
>> perf.c:323
>> #9 0x0000000000401632 in handle_internal_command (argv=0x7ffc843c5b30,
>> argc=2) at perf.c:377
>> #10 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at
>> perf.c:421
>> #11 main (argc=2, argv=0x7ffc843c5b30) at perf.c:537
>>
>> The direct cause of the problem is that rec->evlist->core.all_cpus is
>> empty, resulting in null pointer access.
>>
>> The code process is as follows:
>>
>> cmd_record
>> parse_event(rec->evlist)
>> // Hardware cycle events should not be supported here, so rec->evlist
>> is empty
>> ...
>>
>> evlist__create_maps(rec->evlist)
>> perf_evlist__set_maps(&rec->evlist->core)
>> perf_evlist__propagate_maps(&rec->evlist->core)
>> perf_evlist__for_each_evsel(&rec->evlist->core, evsel)
>> // because rec->evlist is empty, don't get into that
>> __perf_evlist__propagate_maps(), so rec->evlist->core.all_cpus is NULL.
>> __perf_evlist__propagate_maps
>> rec->evlist->core.all_cpus = perf_cpu_map__merge(evlist->all_cpus,
>> evsel->cpus);
>> ...
>>
>> __cmd_record
>> record__synthesize
>> perf_event__synthesize_cpu_map(rec->evlist->core.all_cpus)
>> cpu_map_event__new(rec->evlist->core.all_cpus)
>> cpu_map_data__alloc(rec->evlist->core.all_cpus)
>> perf_cpu_map__max(rec->evlist->core.all_cpus)
>> __perf_cpu_map__nr
>> // Here null pointer access!
>> ...
>>
>> record__open
>> evsel__fallback
>> // Here fallback is just starting
>>
>
> Sorry, I don't follow this. I sent out a patch for the no core PMU
> case - please take a look:
> https://lore.kernel.org/lkml/20230627182834.117565-1-irogers@google.com/
Thanks for the patch, it seems to solve the problem. I'll test it now.
> I haven't got a reproduction for failing to open cycles and it's not
> clear to me why evsel__fallback isn't being used to fallback to clock.
> Were you able to look at this?
1. In the QEMU environment that does not support hardware events, as
shown in [1], the sysfs node related to the core PMU does not exist in
the /sys/bus/event_source/devices/ directory.
As a result, evlist is empty after perf_recod invokes
parse_event(evlist, "cycles:P").
2. When perf_record initializes the member in evlist->core, the content
of some members is NULL because evlist is empty. For example, the value
of evlist->core.all_cpus is NULL.
3. Access the content in evlist->core.all_cpus. The null pointer access
is triggered, as shown in [2]. At this time, perf_record has not invoked
evsel__fallback() yet.
4. evsel__fallback() is invoked in record__open(). Therefore, segfault
occurs before fallback to clock.
[1]:
# ls /sys/bus/event_source/devices/
breakpoint kprobe msr software tracepoint uprobe
[2]:
cmd_record
__cmd_record
record__synthesize
perf_event__synthesize_cpu_map(evlist->core.all_cpus)
cpu_map_event__new(evlist->core.all_cpus)
cpu_map_data__alloc(evlist->core.all_cpus)
perf_cpu_map__max(evlist->core.all_cpus)
__perf_cpu_map__nr
// Here null pointer access ...
Thanks,
Yang
Powered by blists - more mailing lists