linux-kernel - [PATCH 0/7] x86/mm/tlb: make lazy TLB mode even lazier

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-Id: <20180924183759.23955-1-riel@surriel.com>
Date:   Mon, 24 Sep 2018 14:37:52 -0400
From:   Rik van Riel <riel@...riel.com>
To:     linux-kernel@...r.kernel.org
Cc:     peterz@...radead.org, kernel-team@...com, songliubraving@...com,
        mingo@...nel.org, will.deacon@....com, hpa@...or.com,
        luto@...nel.org, npiggin@...il.com
Subject: [PATCH 0/7] x86/mm/tlb: make lazy TLB mode even lazier

Linus asked me to come up with a smaller patch set to get the benefits
of lazy TLB mode, so I spent some time trying out various permutations
of the code, with a few workloads that do lots of context switches, and
also happen to have a fair number of TLB flushes a second.

Both of the workloads tested are memcache style workloads, running
on two socket systems. One of the workloads has around 300,000
context switches a second, and around 19,000 TLB flushes.

The first patch in the series, of always using lazy TLB mode,
reduces CPU use around 1% on both Haswell and Broadwell systems.

The rest of the series reduces the number of TLB flush IPIs by
about 1,500 a second, resulting in a 0.2% reduction in CPU use,
on top of the 1% seen by just enabling lazy TLB mode.

These are the low hanging fruits in the context switch code.

The big thing remaining is the reference count overhead of
the lazy TLB mm_struct, but getting rid of that is rather a
lot of code for a small performance gain. Not quite what
Linus asked for :)