linux-kernel - Re: [PATCH] x86,switch_mm: skip atomic operations for init

Message-ID: <CALCETrXTLqcro4uu5f77EmeaDvNqGZNLpsk-w_BApRNmzvcz=Q@mail.gmail.com>
Date:   Fri, 1 Jun 2018 20:35:29 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     riel@...riel.com
Cc:     Andrew Lutomirski <luto@...nel.org>,
        Mike Galbraith <efault@....de>,
        LKML <linux-kernel@...r.kernel.org>, songliubraving@...com,
        kernel-team <kernel-team@...com>, Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>, X86 ML <x86@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

On Fri, Jun 1, 2018 at 3:13 PM Rik van Riel <riel@...riel.com> wrote:
>
> On Fri, 1 Jun 2018 14:21:58 -0700
> Andy Lutomirski <luto@...nel.org> wrote:
>
> > Hmm.  I wonder if there's a more clever data structure than a bitmap
> > that we could be using here.  Each CPU only ever needs to be in one
> > mm's cpumask, and each cpu only ever changes its own state in the
> > bitmask.  And writes are much less common than reads for most
> > workloads.
>
> It would be easy enough to add an mm_struct pointer to the
> per-cpu tlbstate struct, and iterate over those.
>
> However, that would be an orthogonal change to optimizing
> lazy TLB mode.
>
> Does the (untested) patch below make sense as a potential
> improvement to the lazy TLB heuristic?
>
> ---8<---
> Subject: x86,tlb: workload dependent per CPU lazy TLB switch
>
> Lazy TLB mode is a tradeoff between flushing the TLB and touching
> the mm_cpumask(&init_mm) at context switch time, versus potentially
> incurring a remote TLB flush IPI while in lazy TLB mode.
>
> Whether this pays off is likely to be workload dependent more than
> anything else. However, the current heuristic keys off hardware type.
>
> This patch changes the lazy TLB mode heuristic to a dynamic, per-CPU
> decision, dependent on whether we recently received a remote TLB
> shootdown while in lazy TLB mode.
>
> This is a very simple heuristic. When a CPU receives a remote TLB
> shootdown IPI while in lazy TLB mode, a counter in the same cache
> line is set to 16. Every time we skip lazy TLB mode, the counter
> is decremented.
>
> While the counter is zero (no recent TLB flush IPIs), allow lazy TLB mode.

Hmm, cute.  That's not a bad idea at all.  It would be nice to get
some kind of real benchmark on both PCID and !PCID.  If nothing else,
I would expect the threshold (16 in your patch) to want to be lower on
PCID systems.

--Andy
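
For readers following the thread, the heuristic described in the quoted
patch text (set a counter to 16 when a flush IPI arrives in lazy TLB mode,
decrement it each time lazy mode is skipped, allow lazy mode at zero) could
be sketched in plain userspace C roughly as below. This is only an
illustration of the counter logic, not the patch itself: the struct, the
field name lazy_penalty, and the function names are all hypothetical, and
the real kernel code would live in per-CPU TLB state and the context-switch
path.

```c
#include <stdbool.h>

/* Value taken from the quoted patch description; the reply suggests a
 * lower value may suit PCID systems, so treat this as a tunable. */
#define LAZY_TLB_PENALTY 16

/* Hypothetical stand-in for the per-CPU TLB state. */
struct cpu_tlb_state {
	bool is_lazy;      /* CPU is currently in lazy TLB mode */
	int  lazy_penalty; /* nonzero: a flush IPI hit us in lazy mode recently */
};

/* Called when this CPU receives a remote TLB shootdown IPI. */
void on_remote_flush_ipi(struct cpu_tlb_state *s)
{
	/* Only an IPI that arrives while lazy counts against lazy mode. */
	if (s->is_lazy)
		s->lazy_penalty = LAZY_TLB_PENALTY;
}

/*
 * Called at a context switch where lazy TLB mode is an option.
 * Returns true if the CPU enters lazy mode, false if it is skipped.
 */
bool try_enter_lazy(struct cpu_tlb_state *s)
{
	if (s->lazy_penalty > 0) {
		/* Skip lazy mode and let the penalty decay by one. */
		s->lazy_penalty--;
		s->is_lazy = false;
		return false;
	}

	/* No recent flush IPIs while lazy: lazy mode is allowed. */
	s->is_lazy = true;
	return true;
}
```

With these assumptions, a single flush IPI received in lazy mode makes the
CPU skip lazy mode on its next 16 opportunities, after which lazy mode is
allowed again unless another IPI arrives in the meantime.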