Date:   Wed, 3 Nov 2021 12:51:12 +0530
From:   Ravi Bangoria <ravi.bangoria@....com>
To:     <namhyung@...nel.org>, <acme@...nel.org>, <jolsa@...hat.com>
CC:     <mingo@...nel.org>, <peterz@...radead.org>,
        <linux-kernel@...r.kernel.org>, <ak@...ux.intel.com>,
        <irogers@...gle.com>, <eranian@...gle.com>, <kim.phillips@....com>,
        <ravi.bangoria@....com>
Subject: Re: [PATCH v3] perf evsel: Fix missing exclude_{host,guest} setting

> The current logic for the perf missing feature has a bug that it can
> wrongly clear some modifiers like G or H.  Actually some PMUs don't
> support any filtering or exclusion while others do.  But we check it
> as a global feature.

(Sorry to pitch in a bit late.)

AMD has one more problem along similar lines. On AMD, non-precise and
precise sampling are provided by the core and IBS PMUs respectively.
Plus, the core PMU has filtering capability but IBS does not. Perf by
default sets precise_ip=3 and exclude_guest=1, and keeps decreasing
precise_ip with exclude_guest set until perf_event_open() succeeds.
This causes perf to always fall back to the core PMU (non-precise
mode) even when precise sampling is perfectly feasible. Do you think
this problem should also be addressed while designing the solution for
Namhyung's patch, or solved separately with a patch like the one below:

---><---

>From 48808299679199c39ff737a30a7f387669314fd7 Mon Sep 17 00:00:00 2001
From: Ravi Bangoria <ravi.bangoria@....com>
Date: Tue, 2 Nov 2021 11:01:12 +0530
Subject: [PATCH] perf/amd/ibs: Don't set exclude_guest by default

The perf tool sets exclude_guest by default when calling
perf_event_open(). Because IBS does not have filtering capability,
such an event always gets rejected by the IBS PMU driver, and thus
perf falls back to non-precise sampling. Fix it by not setting
exclude_guest by default on AMD.

Before:
  $ sudo ./perf record -C 0 -vvv true |& grep precise
    precise_ip                       3
  decreasing precise_ip by one (2)
    precise_ip                       2
  decreasing precise_ip by one (1)
    precise_ip                       1
  decreasing precise_ip by one (0)

After:
  $ sudo ./perf record -C 0 -vvv true |& grep precise
    precise_ip                       3
  decreasing precise_ip by one (2)
    precise_ip                       2

Reported-by: Kim Phillips <kim.phillips@....com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@....com>
---
 tools/perf/arch/x86/util/evsel.c | 23 +++++++++++++++++++++++
 tools/perf/util/evsel.c          | 12 +++++++-----
 tools/perf/util/evsel.h          |  1 +
 3 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
index 2f733cdc8dbb..7841a49ce856 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -1,8 +1,31 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <stdio.h>
+#include <stdlib.h>
 #include "util/evsel.h"
+#include "util/env.h"
+#include "linux/string.h"
 
 void arch_evsel__set_sample_weight(struct evsel *evsel)
 {
 	evsel__set_sample_bit(evsel, WEIGHT_STRUCT);
 }
+
+void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr)
+{
+	struct perf_env env = {0};
+
+	if (!perf_env__cpuid(&env))
+		return;
+
+	/*
+	 * On AMD, precise cycles event sampling internally uses IBS pmu.
+	 * But IBS does not have filtering capabilities and perf by default
+	 * sets exclude_guest = 1. This makes IBS pmu event init fail and
+	 * thus perf ends up doing non-precise sampling. Avoid it by clearing
+	 * exclude_guest.
+	 */
+	if (env.cpuid && strstarts(env.cpuid, "AuthenticAMD"))
+		attr->exclude_guest = 0;
+
+	free(env.cpuid);
+}
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index dbfeceb2546c..0b4267d4bb38 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -294,7 +294,7 @@ static bool perf_event_can_profile_kernel(void)
 	return perf_event_paranoid_check(1);
 }
 
-struct evsel *evsel__new_cycles(bool precise, __u32 type, __u64 config)
+struct evsel *evsel__new_cycles(bool precise __maybe_unused, __u32 type, __u64 config)
 {
 	struct perf_event_attr attr = {
 		.type	= type,
@@ -305,18 +305,16 @@ struct evsel *evsel__new_cycles(bool precise, __u32 type, __u64 config)
 
 	event_attr_init(&attr);
 
-	if (!precise)
-		goto new_event;
-
 	/*
 	 * Now let the usual logic to set up the perf_event_attr defaults
 	 * to kick in when we return and before perf_evsel__open() is called.
 	 */
-new_event:
 	evsel = evsel__new(&attr);
 	if (evsel == NULL)
 		goto out;
 
+	arch_evsel__fixup_new_cycles(&evsel->core.attr);
+
 	evsel->precise_max = true;
 
 	/* use asprintf() because free(evsel) assumes name is allocated */
@@ -1047,6 +1045,10 @@ void __weak arch_evsel__set_sample_weight(struct evsel *evsel)
 	evsel__set_sample_bit(evsel, WEIGHT);
 }
 
+void __weak arch_evsel__fixup_new_cycles(struct perf_event_attr *attr __maybe_unused)
+{
+}
+
 /*
  * The enable_on_exec/disabled value strategy:
  *
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 1f7edfa8568a..764e54c6a23d 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -277,6 +277,7 @@ void __evsel__reset_sample_bit(struct evsel *evsel, enum perf_event_sample_forma
 void evsel__set_sample_id(struct evsel *evsel, bool use_sample_identifier);
 
 void arch_evsel__set_sample_weight(struct evsel *evsel);
+void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr);
 
 int evsel__set_filter(struct evsel *evsel, const char *filter);
 int evsel__append_tp_filter(struct evsel *evsel, const char *filter);
-- 
2.27.0
