lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20231107183605.409588-1-seanjc@google.com>
Date:   Tue,  7 Nov 2023 10:36:05 -0800
From:   Sean Christopherson <seanjc@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org
Cc:     linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
        Like Xu <likexu@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Mingwei Zhang <mizhang@...gle.com>,
        Sean Christopherson <seanjc@...gle.com>
Subject: [PATCH] perf/x86: Don't enforce minimum period for KVM guest-only events

Don't apply minimum period workarounds/requirements to events that are
being created by KVM to virtualize PMCs for guests, i.e. skip limit
enforcement for events that exclude the host.  Perf's somewhat arbitrary
limits prevents KVM from correctly virtualizing counter overflow, e.g. if
the guest sets a counter to have an effective period of '1', forcing a
minimum period of '2' results in overflow occurring at the incorrect time.

Whether or not a "real" profiling use case is affected is debatable, but
the incorrect behavior is trivially easy to observe and reproduce, and is
deterministic enough to make the PMU appear to be broken from the guest's
perspective.

Furthermore, the "period" set by KVM isn't actually a period, as KVM won't
automatically reprogram the event with the same period on overflow.  KVM
will synthesize a PMI into the guest when appropriate, but what the guest
does in response to the PMI is purely a guest decision.  In other words,
KVM effectively operates in a one-shot mode, not a periodic mode.

Letting KVM and/or the guest program "too small" periods is safe for the
host, as events that exclude the host are atomically disabled with respect
to VM-Exit, i.e. are guaranteed to stop counting upon transitioning to the
host.  And whether or not *explicitly* programming a short period is safe
is somewhat of a moot point, as transitions to/from the guest effectively
yield the same effect, e.g. an unrelated VM-Exit => VM-Enter transition
will re-enable guest PMCs with whatever count happened to be in the PMC at
the time of VM-Exit.

Cc: Like Xu <likexu@...cent.com>
Cc: Jim Mattson <jmattson@...gle.com>
Cc: Mingwei Zhang <mizhang@...gle.com>
Signed-off-by: Sean Christopherson <seanjc@...gle.com>
---

Disclaimer: I've only tested this from KVM's side of things.

 arch/x86/events/core.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 40ad1425ffa2..f8a8a4ea4d47 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1388,16 +1388,25 @@ int x86_perf_event_set_period(struct perf_event *event)
 		hwc->last_period = period;
 		ret = 1;
 	}
-	/*
-	 * Quirk: certain CPUs dont like it if just 1 hw_event is left:
-	 */
-	if (unlikely(left < 2))
-		left = 2;
 
 	if (left > x86_pmu.max_period)
 		left = x86_pmu.max_period;
 
-	static_call_cond(x86_pmu_limit_period)(event, &left);
+	/*
+	 * Exempt KVM guest events from the minimum period requirements.  It's
+	 * the guest's responsibility to ensure it can make forward progress,
+	 * and it's KVM's responsibility to configure an appropriate "period"
+	 * to correctly virtualize overflow for the guest's PMCs.
+	 */
+	if (!event->attr.exclude_host) {
+		/*
+		 * Quirk: certain CPUs dont like it if just 1 event is left:
+		 */
+		if (unlikely(left < 2))
+			left = 2;
+
+		static_call_cond(x86_pmu_limit_period)(event, &left);
+	}
 
 	this_cpu_write(pmc_prev_left[idx], left);
 

base-commit: 744940f1921c8feb90e3c4bcc1e153fdd6e10fe2
-- 
2.42.0.869.gea05f2083d-goog

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ