Message-ID: <ZrDzBrrBPBkSKLRC@x1>
Date: Mon, 5 Aug 2024 12:43:02 -0300
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: Athira Rajeev <atrajeev@...ux.vnet.ibm.com>
Cc: Eric Lin <eric.lin@...ive.com>, Ian Rogers <irogers@...gle.com>,
	Namhyung Kim <namhyung@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>, Mark Rutland <mark.rutland@....com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Jiri Olsa <jolsa@...nel.org>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Kan Liang <kan.liang@...ux.intel.com>,
	James Clark <james.clark@....com>,
	linux-perf-users <linux-perf-users@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>, vincent.chen@...ive.com,
	greentime.hu@...ive.com, Samuel Holland <samuel.holland@...ive.com>
Subject: Re: [PATCH] perf pmus: Fix duplicate events caused segfault

On Mon, Aug 05, 2024 at 07:54:33PM +0530, Athira Rajeev wrote:
> 
> 
> > On 4 Aug 2024, at 8:36 PM, Eric Lin <eric.lin@...ive.com> wrote:
> > 
> > Hi,
> > 
> > On Sun, Jul 21, 2024 at 11:44 PM Eric Lin <eric.lin@...ive.com> wrote:
> >> 
> >> Hi Athira,
> >> 
> >> On Sat, Jul 20, 2024 at 4:35 PM Athira Rajeev
> >> <atrajeev@...ux.vnet.ibm.com> wrote:
> >>> 
> >>> 
> >>> 
> >>>> On 19 Jul 2024, at 1:46 PM, Eric Lin <eric.lin@...ive.com> wrote:
> >>>> 
> >>>> Currently, if vendor JSON files have two duplicate event names,
> >>>> the "perf list" command will trigger a segfault.
> >>>> 
> >>>> In commit e6ff1eed3584 ("perf pmu: Lazily add JSON events"),
> >>>> pmu_events_table__num_events() gets the number of JSON events
> >>>> from table_pmu->num_entries, which includes duplicate events
> >>>> if there are duplicate event names in the JSON files.
> >>> 
> >>> Hi Eric,
> >>> 
> >>> Let us consider there are duplicate event names in the JSON files, say :
> >>> 
> >>> metric.json with: EventName as pmu_cache_miss, EventCode as 0x1
> >>> cache.json with:  EventName as pmu_cache_miss, EventCode as 0x2
> >>> 
> >>> If we fix the segfault and proceed, will "perf list" still list only one entry for pmu_cache_miss, with maybe 0x1 or 0x2 as the event code?
> >>> Can you check what "perf list" shows in this case? If it lists only one entry, does that mean pmu_cache_miss has two event codes and works with either of them?
> >>> 
> >> 
> >> Sorry for the late reply.
> >> Yes, I've checked: if there are duplicate pmu_cache_miss events in the
> >> JSON files, perf list shows only one entry.
> >> 
> >>> If the duplicate entry with a different event code is a mistake in the JSON file (e.g. from some broken commit), I wonder whether the better fix is to keep only the valid entry in the JSON file?
> >>> 
> >> 
> >> Yes, I agree we should fix the duplicate events in vendor JSON files.
> >> 
> >> According to this code snippet [1], it seems the perf tool originally
> >> allowed duplicate events to exist and skipped them, so that the
> >> duplicates are not shown in perf list.
> >> However, after commit e6ff1eed3584 ("perf pmu: Lazily add JSON
> >> events"), two duplicate events cause a segfault.
> >> 
> >> Can I ask, do you have any suggestions? Thanks.
> >> 
> >> [1] https://github.com/torvalds/linux/blob/master/tools/perf/util/pmus.c#L491
> >> 
> > 
> > Kindly ping.
> > 
> > Can I ask, are there any more comments about this patch? Thanks.
> > 
> Hi Eric,
> 
> The function there refers to aliases and skips duplicate aliases. I am not sure whether that applies to events.
> 
> Namhyung, Ian, Arnaldo
> Any comments here ?

So I was trying to reproduce the problem here before looking at the
patch, tried a simple:

⬢[acme@...lbox perf-tools-next]$ git diff
diff --git a/tools/perf/pmu-events/arch/x86/rocketlake/cache.json b/tools/perf/pmu-events/arch/x86/rocketlake/cache.json
index 2e93b7835b41442b..167a41b0309b7cfc 100644
--- a/tools/perf/pmu-events/arch/x86/rocketlake/cache.json
+++ b/tools/perf/pmu-events/arch/x86/rocketlake/cache.json
@@ -1,4 +1,13 @@
 [
+    {
+        "BriefDescription": "Counts the number of cache lines replaced in L1 data cache.",
+        "Counter": "0,1,2,3",
+        "EventCode": "0x51",
+        "EventName": "L1D.REPLACEMENT",
+        "PublicDescription": "Counts L1D data line replacements including opportunistic replacements, and replacements that require stall-for-replace or block-for-replace.",
+        "SampleAfterValue": "100003",
+        "UMask": "0x1"
+    },
     {
         "BriefDescription": "Counts the number of cache lines replaced in L1 data cache.",
         "Counter": "0,1,2,3",
⬢[acme@...lbox perf-tools-next]$ grep L1D.REPLACEMENT tools/perf/pmu-events/arch/x86/rocketlake/cache.json
        "EventName": "L1D.REPLACEMENT",
        "EventName": "L1D.REPLACEMENT",
⬢[acme@...lbox perf-tools-next]$

I.e. I duplicated that whole event definition.
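
The same edit can be scripted when trying other JSON files. This is a minimal sketch, assuming the file is a JSON array of event objects as in the diff above. The helper name duplicate_first_event is hypothetical and not part of perf, and the inline sample only carries three of the fields shown in the diff.

```python
import copy
import json


def duplicate_first_event(events):
    """Return a new list with a copy of the first event prepended.

    Hypothetical helper: mirrors the manual edit in the diff above,
    which repeats the first event definition of the JSON array.
    """
    if not events:
        return []
    return [copy.deepcopy(events[0])] + list(events)


# Inline sample, trimmed to a few fields from the diff above.
original = json.loads(
    '[{"EventName": "L1D.REPLACEMENT", "EventCode": "0x51", "UMask": "0x1"}]'
)
patched = duplicate_first_event(original)
print(json.dumps(patched, indent=4))
```

To apply it to a real file, load the file with json.load(), pass the list through the helper, and write it back with json.dump() before rebuilding perf.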

Then I did a make clean and a rebuild, and:

root@x1:/home/acme/git/pahole# perf list l1d.replacement

List of pre-defined events (to be used in -e or -M):


cache:
  l1d.replacement
       [Counts the number of cache lines replaced in L1 data cache. Unit: cpu_core]
root@x1:/home/acme/git/pahole# perf list > /dev/null
root@x1:/home/acme/git/pahole#

No crash. Can you provide instructions on how to reproduce the problem?

I would like to use the experience to add a 'perf test' that fails
without the patch and passes with it.
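
Independently of a 'perf test', duplicate names in the vendor JSON files could be flagged before building. The following is only a standalone sketch, not a perf test and not how perf itself handles duplicates. The function name find_duplicate_event_names is hypothetical, and comparing names case-insensitively is an assumption based on perf list printing l1d.replacement in lower case above. The sample data reuses event names from this thread.

```python
import json
from collections import Counter


def find_duplicate_event_names(events):
    """Return the sorted, lower-cased EventName values that occur more than once.

    Hypothetical helper; names are compared case-insensitively, which is
    an assumption about how duplicates should be matched.
    """
    counts = Counter(
        event["EventName"].lower() for event in events if "EventName" in event
    )
    return sorted(name for name, n in counts.items() if n > 1)


# Sample data built from event names mentioned in this thread.
sample = json.loads("""[
    {"EventName": "L1D.REPLACEMENT", "EventCode": "0x51", "UMask": "0x1"},
    {"EventName": "L1D.REPLACEMENT", "EventCode": "0x51", "UMask": "0x1"},
    {"EventName": "pmu_cache_miss", "EventCode": "0x1"}
]""")
print(find_duplicate_event_names(sample))
```

Since the example earlier in the thread has the duplicate spread over two files (metric.json and cache.json), the lists loaded from all JSON files of one architecture directory would need to be concatenated before calling the function.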

- Arnaldo


