[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZrDzBrrBPBkSKLRC@x1>
Date: Mon, 5 Aug 2024 12:43:02 -0300
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: Athira Rajeev <atrajeev@...ux.vnet.ibm.com>
Cc: Eric Lin <eric.lin@...ive.com>, Ian Rogers <irogers@...gle.com>,
Namhyung Kim <namhyung@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Kan Liang <kan.liang@...ux.intel.com>,
James Clark <james.clark@....com>,
linux-perf-users <linux-perf-users@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, vincent.chen@...ive.com,
greentime.hu@...ive.com, Samuel Holland <samuel.holland@...ive.com>
Subject: Re: [PATCH] perf pmus: Fix duplicate events caused segfault
On Mon, Aug 05, 2024 at 07:54:33PM +0530, Athira Rajeev wrote:
>
>
> > On 4 Aug 2024, at 8:36 PM, Eric Lin <eric.lin@...ive.com> wrote:
> >
> > Hi,
> >
> > On Sun, Jul 21, 2024 at 11:44 PM Eric Lin <eric.lin@...ive.com> wrote:
> >>
> >> Hi Athira,
> >>
> >> On Sat, Jul 20, 2024 at 4:35 PM Athira Rajeev
> >> <atrajeev@...ux.vnet.ibm.com> wrote:
> >>>
> >>>
> >>>
> >>>> On 19 Jul 2024, at 1:46 PM, Eric Lin <eric.lin@...ive.com> wrote:
> >>>>
> >>>> Currently, if vendor JSON files have two duplicate event names,
> >>>> the "perf list" command will trigger a segfault.
> >>>>
> >>>> In commit e6ff1eed3584 ("perf pmu: Lazily add JSON events"),
> >>>> pmu_events_table__num_events() gets the number of JSON events
> >>>> from table_pmu->num_entries, which includes duplicate events
> >>>> if there are duplicate event names in the JSON files.
> >>>
> >>> Hi Eric,
> >>>
> >>> Let us consider there are duplicate event names in the JSON files, say :
> >>>
> >>> metric.json with: EventName as pmu_cache_miss, EventCode as 0x1
> >>> cache.json with: EventName as pmu_cache_miss, EventCode as 0x2
> >>>
> >>> If we fix the segfault and proceed, still “perf list” will list only one entry for pmu_cache_miss with may be 0x1/0x2 as event code ?
> >>> Can you check the result to confirm what “perf list” will list in this case ? If it’s going to have only one entry in perf list, does it mean there are two event codes for pmu_cache_miss and it can work with either of the event code ?
> >>>
> >>
> >> Sorry for the late reply.
> >> Yes, I've checked if there are duplicate pmu_cache_miss events in the
> >> JSON files, the perf list will have only one entry in perf list.
> >>
> >>> If it happens to be a mistake in json file to have duplicate entry with different event code (ex: with some broken commit), I am thinking if the better fix is to keep only the valid entry in json file ?
> >>>
> >>
> >> Yes, I agree we should fix the duplicate events in vendor JSON files.
> >>
> >> According to this code snippet [1], it seems the perf tool originally
> >> allowed duplicate events to exist and it will skip the duplicate
> >> events not shown on the perf list.
> >> However, after this commit e6ff1eed3584 ("perf pmu: Lazily add JSON
> >> events"), if there are two duplicate events, it causes a segfault.
> >>
> >> Can I ask, do you have any suggestions? Thanks.
> >>
> >> [1] https://github.com/torvalds/linux/blob/master/tools/perf/util/pmus.c#L491
> >>
> >
> > Kindly ping.
> >
> > Can I ask, are there any more comments about this patch? Thanks.
> >
> Hi Eric,
>
> The functions there says alias and to skip duplicate alias. I am not sure if that is for events
>
> Namhyung, Ian, Arnaldo
> Any comments here ?
So I was trying to reproduce the problem here before looking at the
patch, tried a simple:
⬢[acme@...lbox perf-tools-next]$ git diff
diff --git a/tools/perf/pmu-events/arch/x86/rocketlake/cache.json b/tools/perf/pmu-events/arch/x86/rocketlake/cache.json
index 2e93b7835b41442b..167a41b0309b7cfc 100644
--- a/tools/perf/pmu-events/arch/x86/rocketlake/cache.json
+++ b/tools/perf/pmu-events/arch/x86/rocketlake/cache.json
@@ -1,4 +1,13 @@
[
+ {
+ "BriefDescription": "Counts the number of cache lines replaced in L1 data cache.",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x51",
+ "EventName": "L1D.REPLACEMENT",
+ "PublicDescription": "Counts L1D data line replacements including opportunistic replacements, and replacements that require stall-for-replace or block-for-replace.",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1"
+ },
{
"BriefDescription": "Counts the number of cache lines replaced in L1 data cache.",
"Counter": "0,1,2,3",
⬢[acme@...lbox perf-tools-next]$ grep L1D.REPLACEMENT tools/perf/pmu-events/arch/x86/rocketlake/cache.json
"EventName": "L1D.REPLACEMENT",
"EventName": "L1D.REPLACEMENT",
⬢[acme@...lbox perf-tools-next]$
I.e. duplicated that whole event definition:
Did a make clean and a rebuild and:
root@x1:/home/acme/git/pahole# perf list l1d.replacement
List of pre-defined events (to be used in -e or -M):
cache:
l1d.replacement
[Counts the number of cache lines replaced in L1 data cache. Unit: cpu_core]
root@x1:/home/acme/git/pahole# perf list > /dev/null
root@x1:/home/acme/git/pahole#
No crash, can you provide instructions on how to reproduce the problem?
I would like to use the experience to add a 'perf test' to show this
failing and then after the patch it passing that new test.
- Arnaldo
Powered by blists - more mailing lists