linux-kernel - Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180830102354.GY24124@hirez.programming.kicks-ass.net>
Date:   Thu, 30 Aug 2018 12:23:54 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Vineet Gupta <Vineet.Gupta1@...opsys.com>
Cc:     Nicholas Piggin <npiggin@...il.com>,
        Will Deacon <will.deacon@....com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Benjamin Herrenschmidt <benh@....ibm.com>,
        Andrew Lutomirski <luto@...nel.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        Borislav Petkov <bp@...en8.de>,
        Rik van Riel <riel@...riel.com>,
        Jann Horn <jannh@...gle.com>,
        Adin Scannell <ascannell@...gle.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>,
        David Miller <davem@...emloft.net>,
        Martin Schwidefsky <schwidefsky@...ibm.com>,
        Michael Ellerman <mpe@...erman.id.au>,
        "jejb@...isc-linux.org" <jejb@...isc-linux.org>
Subject: Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for
 RCU_TABLE_FREE

On Thu, Aug 30, 2018 at 12:13:50AM +0000, Vineet Gupta wrote:
> On 08/27/2018 04:00 AM, Peter Zijlstra wrote:
> >
> > The one obvious thing SH and ARM want is a sensible default for
> > tlb_start_vma(). (also: https://lkml.org/lkml/2004/1/15/6 )
> >
> > The below make tlb_start_vma() default to flush_cache_range(), which
> > should be right and sufficient. The only exceptions that I found where
> > (oddly):
> >
> >   - m68k-mmu
> >   - sparc64
> >   - unicore
> >
> > Those architectures appear to have a non-NOP flush_cache_range(), but
> > their current tlb_start_vma() does not call it.
> 
> So indeed we follow the DaveM's insight from 2004 about tlb_{start,end}_vma() and
> those are No-ops for ARC for the general case. For the historic VIPT aliasing
> dcache they are what they should be per 2004 link above - I presume that is all
> hunky dory with you ?

Yep, I was just confused about those 3 architectures having
flush_cache_range() but not calling it from tlb_start_vma(). The regular
VIPT aliasing thing is all good. And not having them is also fine.

> > Furthermore, I think tlb_flush() is broken on arc and parisc; in
> > particular they don't appear to have any TLB invalidate for the
> > shift_arg_pages() case, where we do not call tlb_*_vma() and fullmm=0.
> 
> Care to explain this issue a bit more ?
> And that is independent of the current discussion.
> 
> > Possibly shift_arg_pages() should be fixed instead.

So the problem is that shift_arg_pages() does not call tlb_start_vma() /
tlb_end_vma(). It also has fullmm=0. Therefore, on ARC, there will not
be a TLB invalidate at all when freeing the page-tables.

And while looking at this more, move_page_tables() also looks dodgy, I
think it always needs a TLB flush of the entire 'old' range.

> > diff --git a/arch/arc/include/asm/tlb.h b/arch/arc/include/asm/tlb.h
> > index a9db5f62aaf3..7af2b373ebe7 100644
> > --- a/arch/arc/include/asm/tlb.h
> > +++ b/arch/arc/include/asm/tlb.h
> > @@ -23,15 +23,6 @@ do {						\
> >   *
> >   * Note, read http://lkml.org/lkml/2004/1/15/6
> >   */
> > -#ifndef CONFIG_ARC_CACHE_VIPT_ALIASING
> > -#define tlb_start_vma(tlb, vma)
> > -#else
> > -#define tlb_start_vma(tlb, vma)						\
> > -do {									\
> > -	if (!tlb->fullmm)						\
> > -		flush_cache_range(vma, vma->vm_start, vma->vm_end);	\
> > -} while(0)
> > -#endif
> >  
> >  #define tlb_end_vma(tlb, vma)						\
> >  do {									\
> 
> [snip..]
> 
> > 				      \
> > diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> > index e811ef7b8350..1d037fd5bb7a 100644
> > --- a/include/asm-generic/tlb.h
> > +++ b/include/asm-generic/tlb.h
> > @@ -181,19 +181,21 @@ static inline void tlb_remove_check_page_size_change(struct mmu_gather *tlb,
> >   * the vmas are adjusted to only cover the region to be torn down.
> >   */
> >  #ifndef tlb_start_vma
> > -#define tlb_start_vma(tlb, vma) do { } while (0)
> > +#define tlb_start_vma(tlb, vma)						\
> > +do {									\
> > +	if (!tlb->fullmm)						\
> > +		flush_cache_range(vma, vma->vm_start, vma->vm_end);	\
> > +} while (0)
> >  #endif
> 
> So for non aliasing arches to be not affected, this relies on flush_cache_range()
> to be no-op ?

Yes; a cursory inspected shows this to be so. With 'obvious' exception
from the above 3 architectures.

> > -#define __tlb_end_vma(tlb, vma)					\
> > -	do {							\
> > -		if (!tlb->fullmm && tlb->end) {			\
> > -			tlb_flush(tlb);				\
> > -			__tlb_reset_range(tlb);			\
> > -		}						\
> > -	} while (0)
> > -
> >  #ifndef tlb_end_vma
> > -#define tlb_end_vma	__tlb_end_vma
> > +#define tlb_end_vma(tlb, vma)						\
> > +	do {								\
> > +		if (!tlb->fullmm && tlb->end) {				\
> > +			tlb_flush(tlb);					\
> > +			__tlb_reset_range(tlb);				\
> > +		}							\
> > +	} while (0)
> >  #endif
> >  
> >  #ifndef __tlb_remove_tlb_entry
> 
> And this one is for shift_arg_pages() but will also cause extraneous flushes for
> other cases - not happening currently !

No, shift_arg_pages() does not in fact call tlb_end_vma().

The reason for the flush in tlb_end_vma() is (as far as we can remember)
to 'better' deal with large gaps between adjacent VMAs.  Without the
flush here, the range would be extended to cover the (potential) dead
space between the VMAs. Resulting in more expensive range flushes.