Date:   Fri, 22 Jun 2018 07:58:43 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     Rik van Riel <riel@...riel.com>
Cc:     LKML <linux-kernel@...r.kernel.org>, 86@...r.kernel.org,
        Andrew Lutomirski <luto@...nel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Mike Galbraith <efault@....de>, songliubraving@...com,
        kernel-team <kernel-team@...com>
Subject: Re: [PATCH 2/7] x86,tlb: leave lazy TLB mode at page table free time

On Wed, Jun 20, 2018 at 12:57 PM Rik van Riel <riel@...riel.com> wrote:
>
> Andy discovered that speculative memory accesses while in lazy
> TLB mode can crash a system, when a CPU tries to dereference a
> speculative access using memory contents that used to be valid
> page table memory, but have since been reused for something else
> and point into la-la land.
>
> This problem can be prevented in two ways. The first is to
> always send a TLB shootdown IPI to CPUs in lazy TLB mode, while
> the second one is to only send the TLB shootdown at page table
> freeing time.
>
> The second should result in fewer IPIs, since operations like
> mprotect and madvise are very common with some workloads, but
> do not involve page table freeing. Also, on munmap, batching
> of page table freeing covers much larger ranges of virtual
> memory than the batching of unmapped user pages.
>
> Signed-off-by: Rik van Riel <riel@...riel.com>
> Tested-by: Song Liu <songliubraving@...com>
> ---
>  arch/x86/include/asm/tlbflush.h |  5 +++++
>  arch/x86/mm/tlb.c               | 24 ++++++++++++++++++++++++
>  include/asm-generic/tlb.h       | 10 ++++++++++
>  mm/memory.c                     | 22 ++++++++++++++--------
>  4 files changed, 53 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index 6690cd3fc8b1..3aa3204b5dc0 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -554,4 +554,9 @@ extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
>         native_flush_tlb_others(mask, info)
>  #endif
>
> +extern void tlb_flush_remove_tables(struct mm_struct *mm);
> +extern void tlb_flush_remove_tables_local(void *arg);
> +
> +#define HAVE_TLB_FLUSH_REMOVE_TABLES
> +
>  #endif /* _ASM_X86_TLBFLUSH_H */
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index e055d1a06699..61773b07ed54 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -646,6 +646,30 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
>         put_cpu();
>  }
>
> +void tlb_flush_remove_tables_local(void *arg)
> +{
> +       struct mm_struct *mm = arg;
> +
> +       if (this_cpu_read(cpu_tlbstate.loaded_mm) == mm &&
> +                       this_cpu_read(cpu_tlbstate.is_lazy))
> +               /*
> +                * We're in lazy mode.  We need to at least flush our
> +                * paging-structure cache to avoid speculatively reading
> +                * garbage into our TLB.  Since switching to init_mm is barely
> +                * slower than a minimal flush, just switch to init_mm.
> +                */
> +               switch_mm_irqs_off(NULL, &init_mm, NULL);

Can you add braces?
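
For reference, that just means wrapping the commented statement in braces,
something like this (same logic as above, only reformatted):

        if (this_cpu_read(cpu_tlbstate.loaded_mm) == mm &&
                        this_cpu_read(cpu_tlbstate.is_lazy)) {
                /*
                 * We're in lazy mode.  We need to at least flush our
                 * paging-structure cache to avoid speculatively reading
                 * garbage into our TLB.  Since switching to init_mm is
                 * barely slower than a minimal flush, just switch to
                 * init_mm.
                 */
                switch_mm_irqs_off(NULL, &init_mm, NULL);
        }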

> +}
> +
> +void tlb_flush_remove_tables(struct mm_struct *mm)
> +{
> +       int cpu = get_cpu();
> +       /*
> +        * XXX: this really only needs to be called for CPUs in lazy TLB mode.
> +        */
> +       if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids)
> +               smp_call_function_many(mm_cpumask(mm), tlb_flush_remove_tables_local, (void *)mm, 1);

I suspect that most of the gain will come from fixing this limitation :)
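
Just to illustrate what I mean, a rough sketch only, not a tested patch: it
assumes a temporary cpumask and that another CPU's cpu_tlbstate fields can be
read remotely, which is racy, so a real patch would have to think through the
ordering against lazy-mode transitions.

        void tlb_flush_remove_tables(struct mm_struct *mm)
        {
                int cpu, this_cpu = get_cpu();
                cpumask_var_t lazymask;

                /*
                 * Build a mask of CPUs that currently have this mm loaded
                 * and are in lazy TLB mode, and only IPI those.
                 */
                if (!alloc_cpumask_var(&lazymask, GFP_ATOMIC)) {
                        /* Allocation failed: fall back to all of mm_cpumask(). */
                        if (cpumask_any_but(mm_cpumask(mm), this_cpu) < nr_cpu_ids)
                                smp_call_function_many(mm_cpumask(mm),
                                                tlb_flush_remove_tables_local,
                                                (void *)mm, 1);
                        put_cpu();
                        return;
                }

                cpumask_clear(lazymask);
                for_each_cpu(cpu, mm_cpumask(mm)) {
                        if (cpu == this_cpu)
                                continue;
                        if (per_cpu(cpu_tlbstate.loaded_mm, cpu) == mm &&
                            per_cpu(cpu_tlbstate.is_lazy, cpu))
                                cpumask_set_cpu(cpu, lazymask);
                }

                if (!cpumask_empty(lazymask))
                        smp_call_function_many(lazymask,
                                        tlb_flush_remove_tables_local,
                                        (void *)mm, 1);

                free_cpumask_var(lazymask);
                put_cpu();
        }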
