lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 27 Jun 2019 19:35:37 -0700
From:   Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...e.de>
Cc:     Alan Cox <alan.cox@...el.com>, Tony Luck <tony.luck@...el.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Andi Kleen <andi.kleen@...el.com>,
        Hans de Goede <hdegoede@...hat.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Jordan Borgner <mail@...dan-borgner.de>,
        "Ravi V. Shankar" <ravi.v.shankar@...el.com>,
        Mohammad Etemadi <mohammad.etemadi@...el.com>,
        Ricardo Neri <ricardo.neri@...el.com>,
        linux-kernel@...r.kernel.org, x86@...nel.org,
        Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
        Andy Shevchenko <andriy.shevchenko@...el.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Peter Feiner <pfeiner@...gle.com>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
Subject: [PATCH v2 2/2] x86, mtrr: generic: Skip cache flushes on CPUs with cache self-snooping

Programming MTRR registers in multi-processor systems is a rather lengthy
process. Furthermore, all processors must program these registers in lock
step and with interrupts disabled; the process also involves flushing
caches and TLBs twice. As a result, the process may take a considerable
amount of time.

In some platforms, this can lead to a large skew of the refined-jiffies
clock source. Early when booting, if no other clock is available (e.g.,
booting with hpet=disabled), the refined-jiffies clock source is used to
monitor the TSC clock source. If the skew of refined-jiffies is too large,
Linux wrongly assumes that the TSC is unstable:

  clocksource: timekeeping watchdog on CPU1: Marking clocksource
               'tsc-early' as unstable because the skew is too large:
  clocksource: 'refined-jiffies' wd_now: fffedc10 wd_last:
               fffedb90 mask: ffffffff
  clocksource: 'tsc-early' cs_now: 5eccfddebc cs_last: 5e7e3303d4
               mask: ffffffffffffffff
  tsc: Marking TSC unstable due to clocksource watchdog

As per my measurements, around 98% of the time needed by the procedure to
program MTRRs in multi-processor systems is spent flushing caches with
wbinvd(). As per the Section 11.11.8 of the Intel 64 and IA 32
Architectures Software Developer's Manual, it is not necessary to flush
caches if the CPU supports cache self-snooping. Thus, skipping the cache
flushes can reduce by several tens of milliseconds the time needed to
complete the programming of the MTRR registers.

Cc: Alan Cox <alan.cox@...el.com>
Cc: Tony Luck <tony.luck@...el.com>
Cc: "H. Peter Anvin" <hpa@...or.com>
Cc: Andy Shevchenko <andriy.shevchenko@...el.com>
Cc: Andi Kleen <ak@...ux.intel.com>
Cc: Hans de Goede <hdegoede@...hat.com>
Cc: Peter Feiner <pfeiner@...gle.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: Jordan Borgner <mail@...dan-borgner.de>
Cc: x86@...nel.org
Cc: linux-kernel@...r.kernel.org
Cc: "Ravi V. Shankar" <ravi.v.shankar@...el.com>
Reported-by: Mohammad Etemadi <mohammad.etemadi@...el.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
---
 arch/x86/kernel/cpu/mtrr/generic.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 9356c1c9024d..aa5c064a6a22 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -743,7 +743,15 @@ static void prepare_set(void) __acquires(set_atomicity_lock)
 	/* Enter the no-fill (CD=1, NW=0) cache mode and flush caches. */
 	cr0 = read_cr0() | X86_CR0_CD;
 	write_cr0(cr0);
-	wbinvd();
+
+	/*
+	 * Cache flushing is the most time-consuming step when programming
+	 * the MTRRs. Fortunately, as per the Intel Software Development
+	 * Manual, we can skip it if the processor supports cache self-
+	 * snooping.
+	 */
+	if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
+		wbinvd();
 
 	/* Save value of CR4 and clear Page Global Enable (bit 7) */
 	if (boot_cpu_has(X86_FEATURE_PGE)) {
@@ -760,7 +768,10 @@ static void prepare_set(void) __acquires(set_atomicity_lock)
 
 	/* Disable MTRRs, and set the default type to uncached */
 	mtrr_wrmsr(MSR_MTRRdefType, deftype_lo & ~0xcff, deftype_hi);
-	wbinvd();
+
+	/* Again, only flush caches if we have to. */
+	if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
+		wbinvd();
 }
 
 static void post_set(void) __releases(set_atomicity_lock)
-- 
2.17.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ