Message-ID: <Z4B279zu_8Kz5N6u@google.com>
Date: Thu, 9 Jan 2025 17:25:03 -0800
From: Namhyung Kim <namhyung@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Mark Rutland <mark.rutland@....com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Jiri Olsa <jolsa@...nel.org>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Kan Liang <kan.liang@...ux.intel.com>,
	James Clark <james.clark@...aro.org>, Ze Gao <zegao2021@...il.com>,
	Weilin Wang <weilin.wang@...el.com>,
	Dominique Martinet <asmadeus@...ewreck.org>,
	Jean-Philippe Romain <jean-philippe.romain@...s.st.com>,
	Junhao He <hejunhao3@...wei.com>, linux-perf-users@...r.kernel.org,
	linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
	Aditya Bodkhe <Aditya.Bodkhe1@....com>, Leo Yan <leo.yan@....com>,
	Atish Patra <atishp@...osinc.com>
Subject: Re: [PATCH v5 3/4] perf record: Skip don't fail for events that
 don't open

On Thu, Jan 09, 2025 at 02:21:08PM -0800, Ian Rogers wrote:
> Whilst for many tools it is an expected behavior that failure to open
> a perf event is a failure, ARM decided to name PMU events the same as
> legacy events and then failed to rename such events on a server uncore
> SLC PMU. As perf's default behavior when no PMU is specified is to
> open the event on all PMUs that advertise/"have" the event, this
> yielded failures when trying to make the priority of legacy and
> sysfs/json events uniform - something requested by RISC-V and ARM. A
> legacy event user on ARM hardware may find their event opened on an
> uncore PMU which for perf record will fail. Arnaldo suggested skipping
> such events which this patch implements. Rather than have the skipping
> conditional on running on ARM, the skipping is done on all
> architectures as such a fundamental behavioral difference could lead
> to problems with tools built/depending on perf.
> 
> An example of perf record failing to open events on x86 is:
> ```
> $ perf record -e data_read,cycles,LLC-prefetch-read -a sleep 0.1
> Error:
> Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0' which will be removed.
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> "dmesg | grep -i perf" may provide additional information.
> 
> Error:
> Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1' which will be removed.
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> "dmesg | grep -i perf" may provide additional information.
> 
> Error:
> Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> The LLC-prefetch-read event is not supported.
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 2.188 MB perf.data (87 samples) ]

I'm afraid this can be too noisy.

> 
> $ perf report --stats
> Aggregated stats:
>                TOTAL events:      17255
>                 MMAP events:        284  ( 1.6%)
>                 COMM events:       1961  (11.4%)
>                 EXIT events:          1  ( 0.0%)
>                 FORK events:       1960  (11.4%)
>               SAMPLE events:         87  ( 0.5%)
>                MMAP2 events:      12836  (74.4%)
>              KSYMBOL events:         83  ( 0.5%)
>            BPF_EVENT events:         36  ( 0.2%)
>       FINISHED_ROUND events:          2  ( 0.0%)
>             ID_INDEX events:          1  ( 0.0%)
>           THREAD_MAP events:          1  ( 0.0%)
>              CPU_MAP events:          1  ( 0.0%)
>            TIME_CONV events:          1  ( 0.0%)
>        FINISHED_INIT events:          1  ( 0.0%)
> cycles stats:
>               SAMPLE events:         87
> ```
> 
> If all events fail to open then the perf record will fail:
> ```
> $ perf record -e LLC-prefetch-read true
> Error:
> Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> The LLC-prefetch-read event is not supported.
> Error:
> Failure to open any events for recording
> ```
> 
> As an evlist may have dummy events that open when all command line
> events fail we ignore dummy events when detecting if at least some
> events open. This still permits the dummy event on its own to be used
> as a permission check:
> ```
> $ perf record -e dummy true
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.046 MB perf.data ]
> ```
> but allows failure when a dummy event is implicitly inserted or when
> there are insufficient permissions to open it:
> ```
> $ perf record -e LLC-prefetch-read -a true
> Error:
> Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> The LLC-prefetch-read event is not supported.
> Error:
> Failure to open any events for recording
> ```
> 
> The issue with legacy events is that on RISC-V they want the driver to
> not have mappings from legacy to non-legacy config encodings for each
> vendor/model due to size, complexity and difficulty to update. It was
> reported that on ARM Apple-M? CPUs the legacy mapping in the driver
> was broken and the sysfs/json events should always take precedence,
> however, it isn't clear this is still the case. It is the case that
> without working around this issue a legacy event like cycles without a
> PMU can encode differently than when specified with a PMU - the
> non-PMU version favoring legacy encodings, the PMU one avoiding legacy
> encodings.
> 
> The patch removes events and then adjusts the idx value for each
> evsel. This is done so that the dense xyarrays used for file
> descriptors, etc. don't contain broken entries. As event opening
> happens relatively late in the record process, use of the idx value
> before the open will have become corrupted, so it is expected there
> are latent bugs hidden behind this change - the change is best
> effort. As the only vendor that has broken event names is ARM, this
> will principally affect ARM users. They will also experience warning
> messages like those above because of the uncore PMU advertising legacy
> event names.
> 
> Suggested-by: Arnaldo Carvalho de Melo <acme@...nel.org>
> Signed-off-by: Ian Rogers <irogers@...gle.com>
> Tested-by: James Clark <james.clark@...aro.org>
> Tested-by: Leo Yan <leo.yan@....com>
> Tested-by: Atish Patra <atishp@...osinc.com>
> ---
>  tools/perf/builtin-record.c | 47 ++++++++++++++++++++++++++++++++-----
>  1 file changed, 41 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 5db1aedf48df..c0b8249a3787 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -961,7 +961,6 @@ static int record__config_tracking_events(struct record *rec)
>  	 */
>  	if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
>  	    perf_pmus__num_core_pmus() > 1) {
> -
>  		/*
>  		 * User space tasks can migrate between CPUs, so when tracing
>  		 * selected CPUs, sideband for all CPUs is still needed.
> @@ -1366,6 +1365,7 @@ static int record__open(struct record *rec)
>  	struct perf_session *session = rec->session;
>  	struct record_opts *opts = &rec->opts;
>  	int rc = 0;
> +	bool skipped = false;
>  
>  	evlist__for_each_entry(evlist, pos) {
>  try_again:
> @@ -1381,15 +1381,50 @@ static int record__open(struct record *rec)
>  			        pos = evlist__reset_weak_group(evlist, pos, true);
>  				goto try_again;
>  			}
> -			rc = -errno;
>  			evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
> -			ui__error("%s\n", msg);
> -			goto out;
> +			ui__error("Failure to open event '%s' on PMU '%s' which will be removed.\n%s\n",
> +				  evsel__name(pos), evsel__pmu_name(pos), msg);

How about changing it to pr_debug() and adding the lines below ...


> +			pos->skippable = true;
> +			skipped = true;
> +		} else {
> +			pos->supported = true;
>  		}
> -
> -		pos->supported = true;
>  	}
>  
> +	if (skipped) {
> +		struct evsel *tmp;
> +		int idx = 0;
> +		bool evlist_empty = true;
> +
> +		/* Remove evsels that failed to open and update indices. */
> +		evlist__for_each_entry_safe(evlist, tmp, pos) {
> +			if (pos->skippable) {
> +				evlist__remove(evlist, pos);
> +				continue;
> +			}
> +
> +			/*
> +			 * Note, dummy events may be command line parsed or
> +			 * added by the tool. We care about supporting `perf
> +			 * record -e dummy` which may be used as a permission
> +			 * check. Dummy events that are added to the command
> +			 * line and opened along with other events that fail,
> +			 * will still fail as if the dummy events were tool
> +			 * added events for the sake of code simplicity.
> +			 */
> +			if (!evsel__is_dummy_event(pos))
> +				evlist_empty = false;
> +		}
> +		evlist__for_each_entry(evlist, pos) {
> +			pos->core.idx = idx++;
> +		}
> +		/* If list is empty then fail. */
> +		if (evlist_empty) {
> +			ui__error("Failure to open any events for recording.\n");
> +			rc = -1;
> +			goto out;
> +		}

... ?

		if (!verbose)
			ui__warning("Removed some unsupported events, use -v for details.\n");

Thanks,
Namhyung


> +	}
>  	if (symbol_conf.kptr_restrict && !evlist__exclude_kernel(evlist)) {
>  		pr_warning(
>  "WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,\n"
> -- 
> 2.47.1.613.gc27f4b7a9f-goog
> 
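
For readers following the thread, here is a minimal standalone sketch of the
behavior under discussion: drop the events that failed to open, renumber the
survivors densely from 0, fail only when no non-dummy event is left, and keep
the per-event messages behind a verbose flag with a single summary warning
otherwise. It is not the perf code. `struct toy_evsel`, `toy_prune_failed()`
and the message wording are hypothetical names made up for illustration, and a
plain array stands in for the evlist.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for perf's struct evsel; not the real layout. */
struct toy_evsel {
	const char *name;
	int idx;		/* dense index, like pos->core.idx */
	bool open_failed;	/* plays the role of pos->skippable */
	bool is_dummy;
};

/*
 * Compact the array in place, dropping entries that failed to open and
 * renumbering the rest from 0. *nr is updated to the surviving count.
 * Returns 0 on success, or -1 when something was removed and no
 * non-dummy event remains.
 */
static int toy_prune_failed(struct toy_evsel *evsels, size_t *nr, bool verbose)
{
	size_t kept = 0, removed = 0;
	bool have_real_event = false;

	for (size_t i = 0; i < *nr; i++) {
		if (evsels[i].open_failed) {
			/* Per-event detail only when asked for. */
			if (verbose)
				fprintf(stderr, "Skipping event '%s' that failed to open\n",
					evsels[i].name);
			removed++;
			continue;
		}
		evsels[kept] = evsels[i];
		evsels[kept].idx = (int)kept;
		if (!evsels[kept].is_dummy)
			have_real_event = true;
		kept++;
	}
	*nr = kept;

	/* Nothing was skipped: same as the unpatched path, dummy-only is fine. */
	if (removed == 0)
		return 0;

	if (!have_real_event) {
		fprintf(stderr, "Failure to open any events for recording.\n");
		return -1;
	}

	/* One summary line instead of one error per failed event. */
	if (!verbose)
		fprintf(stderr, "Removed some unsupported events, use -v for details.\n");
	return 0;
}
```

With the three events from the x86 example (data_read, cycles,
LLC-prefetch-read), where only cycles opens, the sketch leaves one entry with
index 0 and prints a single summary line unless verbose is set. A lone dummy
event that opens is left alone, while a dummy left over after every other
event failed is treated as a failure, mirroring the cases in the commit
message.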
