linux-kernel - Re: [PATCH 0/10] x86,tlb,mm: more lazy TLB cleanups & optimizations


Message-ID: <20180730113247.GA21619@gmail.com>
Date:   Mon, 30 Jul 2018 13:32:47 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     Rik van Riel <riel@...riel.com>
Cc:     linux-kernel@...r.kernel.org, kernel-team@...com,
        peterz@...radead.org, luto@...nel.org, x86@...nel.org,
        vkuznets@...hat.com, efault@....de, dave.hansen@...el.com,
        will.daecon@....com, catalin.marinas@....com,
        benh@...nel.crashing.org
Subject: Re: [PATCH 0/10] x86,tlb,mm: more lazy TLB cleanups & optimizations


* Rik van Riel <riel@...riel.com> wrote:

> This patch series implements the cleanups suggested by Peter and Andy,
> removes lazy TLB mm refcounting on x86, and shows how other architectures
> could implement that same optimization.
> 
> The previous patch series already seems to have removed most of the
> cache line contention I was seeing at context switch time, so CPU use
> of the memcache and memcache-like workloads has not changed measurably
> with this patch series.
> 
> However, the memory bandwidth used by the memcache system has been
> reduced by about 1%, to serve the same number of queries per second.
> 
> This happens on two-socket Haswell and Broadwell systems. Maybe on
> larger systems (4- or 8-socket) one might also see a measurable drop
> in the amount of CPU time used, with workloads where the previous
> patch series does not remove all cache line contention on the mm.
> 
> This is against the latest -tip tree, and seems to be stable (on top
> of another tree) with workloads that do over a million context switches
> a second.

Just a quick logistics request: once all the review feedback from Andy and PeterZ
is sorted out, could you please (re-)send this series with the Reviewed-by
and Acked-by tags added?

If any patch is still under discussion, please leave it out of the next
series for now, so that I can apply them all to tip:x86/mm right away,
before the next merge window opens.

( If the series reaches this state later today then don't hesitate to do a resend
  with the tags added - I don't want to delay these improvements. )

Thanks!

	Ingo