Message-Id: <20180626173126.12296-7-riel@surriel.com>
Date:   Tue, 26 Jun 2018 13:31:26 -0400
From:   Rik van Riel <riel@...riel.com>
To:     linux-kernel@...r.kernel.org
Cc:     x86@...nel.org, luto@...nel.org, dave.hansen@...ux.intel.com,
        mingo@...nel.org, kernel-team@...com, tglx@...utronix.de,
        efault@....de, songliubraving@...com,
        Rik van Riel <riel@...riel.com>
Subject: [PATCH 6/6] x86,switch_mm: skip atomic operations for init_mm

Song noticed switch_mm_irqs_off taking a lot of CPU time in recent
kernels, using 1.8% of a 48 CPU system during a netperf to localhost run.
Digging into the profile, we noticed that cpumask_clear_cpu and
cpumask_set_cpu together take about half of the CPU time taken by
switch_mm_irqs_off.

However, the CPUs running netperf end up switching back and forth
between netperf and the idle task, which does not require changes
to the mm_cpumask. Furthermore, the init_mm cpumask ends up being
the most heavily contended one in the system.

Skipping cpumask_clear_cpu and cpumask_set_cpu for init_mm
(mostly the idle task) reduced CPU use of switch_mm_irqs_off
from 1.8% of the CPU to 0.9% of the CPU, with the following
netperf commandline:

./super_netperf 300 -P 0 -t TCP_RR -p 8888 -H <host> -l 30 \
     -- -r 300,300 -o -s 1M,1M -S 1M,1M

w/o patchset:

Throughput: 1.71264e+06

perf profile:

0.95%  swapper          [kernel.vmlinux]          [k] switch_mm_irqs_off
0.77%  netserver        [kernel.vmlinux]          [k] switch_mm_irqs_off

w/ patchset:

Throughput: 1.74075e+06

perf profile:

0.87%  swapper          [kernel.vmlinux]          [k] switch_mm_irqs_off

CPU use by enter_lazy_tlb is negligible. The bulk of the
savings from switch_mm_irqs_off seems to go towards higher
netperf throughput.
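
For illustration only, here is a minimal user-space sketch of the idea,
not the kernel code itself. The names toy_mm, toy_init_mm, toy_set_cpu,
toy_clear_cpu, toy_switch_mm and toy_atomic_ops are hypothetical
stand-ins, and a single 64-bit atomic word is assumed in place of a real
cpumask. It shows that a switch to or from the init_mm stand-in performs
one atomic read-modify-write instead of two, and never touches the
shared init_mm mask.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mm_struct: a single 64-bit CPU mask. */
struct toy_mm {
	_Atomic uint64_t cpumask;
};

/* Models init_mm, the mm used by the idle task and kernel threads. */
static struct toy_mm toy_init_mm;

/* Counts atomic read-modify-write operations, for illustration only. */
static unsigned long toy_atomic_ops;

/* Atomically clear this CPU's bit in the mm's mask. */
static void toy_clear_cpu(unsigned int cpu, struct toy_mm *mm)
{
	assert(cpu < 64);
	atomic_fetch_and(&mm->cpumask, ~(UINT64_C(1) << cpu));
	toy_atomic_ops++;
}

/* Atomically set this CPU's bit in the mm's mask. */
static void toy_set_cpu(unsigned int cpu, struct toy_mm *mm)
{
	assert(cpu < 64);
	atomic_fetch_or(&mm->cpumask, UINT64_C(1) << cpu);
	toy_atomic_ops++;
}

/*
 * Mirrors the shape of the patched logic: the mask of the init_mm
 * stand-in is never modified, so no atomic operation is issued on
 * that shared word when switching to or from it.
 */
static void toy_switch_mm(unsigned int cpu, struct toy_mm *prev,
			  struct toy_mm *next)
{
	if (prev == next)
		return;

	/* Stop remote flushes for the previous mm. */
	if (prev != &toy_init_mm)
		toy_clear_cpu(cpu, prev);

	/* Start remote flushes. */
	if (next != &toy_init_mm)
		toy_set_cpu(cpu, next);
}
```

In this model, a round trip idle -> task -> idle on one CPU costs two
atomic operations, both on the task's own mask. Without the two checks
the same round trip would cost four, two of them on the mask shared by
every idle CPU.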

Signed-off-by: Rik van Riel <riel@...riel.com>
Reported-and-tested-by: Song Liu <songliubraving@...com>
---
 arch/x86/mm/tlb.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 96ab4eacda95..ab3992d82c40 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -300,12 +300,15 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 		}
 
 		/* Stop remote flushes for the previous mm */
-		VM_WARN_ON_ONCE(!cpumask_test_cpu(cpu, mm_cpumask(real_prev)) &&
-				real_prev != &init_mm);
-		cpumask_clear_cpu(cpu, mm_cpumask(real_prev));
+		if (real_prev != &init_mm) {
+			VM_WARN_ON_ONCE(!cpumask_test_cpu(cpu,
+					mm_cpumask(real_prev)));
+			cpumask_clear_cpu(cpu, mm_cpumask(real_prev));
+		}
 
 		/* Start remote flushes. */
-		cpumask_set_cpu(cpu, mm_cpumask(next));
+		if (next != &init_mm)
+			cpumask_set_cpu(cpu, mm_cpumask(next));
 	}
 
 	/* Read the tlb_gen to check whether a flush is needed. */
-- 
2.14.4
