linux-hardening - Re: [RESEND PATCH v4 05/11] powerpc/64s: Add ability to skip SLB preload

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <CCHK05L5IMIU.2P6V21JLTU3E5@oc8246131445.ibm.com>
Date:   Thu, 01 Jul 2021 00:28:21 -0500
From:   "Christopher M. Riedl" <cmr@...ux.ibm.com>
To:     "Nicholas Piggin" <npiggin@...il.com>,
        "Daniel Axtens" <dja@...ens.net>, <linuxppc-dev@...ts.ozlabs.org>
Cc:     <tglx@...utronix.de>, <x86@...nel.org>, <keescook@...omium.org>,
        <linux-hardening@...r.kernel.org>
Subject: Re: [RESEND PATCH v4 05/11] powerpc/64s: Add ability to skip SLB
 preload

On Wed Jun 30, 2021 at 11:15 PM CDT, Nicholas Piggin wrote:
> Excerpts from Christopher M. Riedl's message of July 1, 2021 1:48 pm:
> > On Sun Jun 20, 2021 at 10:13 PM CDT, Daniel Axtens wrote:
> >> "Christopher M. Riedl" <cmr@...ux.ibm.com> writes:
> >>
> >> > Switching to a different mm with Hash translation causes SLB entries to
> >> > be preloaded from the current thread_info. This reduces SLB faults, for
> >> > example when threads share a common mm but operate on different address
> >> > ranges.
> >> >
> >> > Preloading entries from the thread_info struct may not always be
> >> > appropriate - such as when switching to a temporary mm. Introduce a new
> >> > boolean in mm_context_t to skip the SLB preload entirely. Also move the
> >> > SLB preload code into a separate function since switch_slb() is already
> >> > quite long. The default behavior (preloading SLB entries from the
> >> > current thread_info struct) remains unchanged.
> >> >
> >> > Signed-off-by: Christopher M. Riedl <cmr@...ux.ibm.com>
> >> >
> >> > ---
> >> >
> >> > v4:  * New to series.
> >> > ---
> >> >  arch/powerpc/include/asm/book3s/64/mmu.h |  3 ++
> >> >  arch/powerpc/include/asm/mmu_context.h   | 13 ++++++
> >> >  arch/powerpc/mm/book3s64/mmu_context.c   |  2 +
> >> >  arch/powerpc/mm/book3s64/slb.c           | 56 ++++++++++++++----------
> >> >  4 files changed, 50 insertions(+), 24 deletions(-)
> >> >
> >> > diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
> >> > index eace8c3f7b0a1..b23a9dcdee5af 100644
> >> > --- a/arch/powerpc/include/asm/book3s/64/mmu.h
> >> > +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
> >> > @@ -130,6 +130,9 @@ typedef struct {
> >> >  	u32 pkey_allocation_map;
> >> >  	s16 execute_only_pkey; /* key holding execute-only protection */
> >> >  #endif
> >> > +
> >> > +	/* Do not preload SLB entries from thread_info during switch_slb() */
> >> > +	bool skip_slb_preload;
> >> >  } mm_context_t;
> >> >  
> >> >  static inline u16 mm_ctx_user_psize(mm_context_t *ctx)
> >> > diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> >> > index 4bc45d3ed8b0e..264787e90b1a1 100644
> >> > --- a/arch/powerpc/include/asm/mmu_context.h
> >> > +++ b/arch/powerpc/include/asm/mmu_context.h
> >> > @@ -298,6 +298,19 @@ static inline int arch_dup_mmap(struct mm_struct *oldmm,
> >> >  	return 0;
> >> >  }
> >> >  
> >> > +#ifdef CONFIG_PPC_BOOK3S_64
> >> > +
> >> > +static inline void skip_slb_preload_mm(struct mm_struct *mm)
> >> > +{
> >> > +	mm->context.skip_slb_preload = true;
> >> > +}
> >> > +
> >> > +#else
> >> > +
> >> > +static inline void skip_slb_preload_mm(struct mm_struct *mm) {}
> >> > +
> >> > +#endif /* CONFIG_PPC_BOOK3S_64 */
> >> > +
> >> >  #include <asm-generic/mmu_context.h>
> >> >  
> >> >  #endif /* __KERNEL__ */
> >> > diff --git a/arch/powerpc/mm/book3s64/mmu_context.c b/arch/powerpc/mm/book3s64/mmu_context.c
> >> > index c10fc8a72fb37..3479910264c59 100644
> >> > --- a/arch/powerpc/mm/book3s64/mmu_context.c
> >> > +++ b/arch/powerpc/mm/book3s64/mmu_context.c
> >> > @@ -202,6 +202,8 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
> >> >  	atomic_set(&mm->context.active_cpus, 0);
> >> >  	atomic_set(&mm->context.copros, 0);
> >> >  
> >> > +	mm->context.skip_slb_preload = false;
> >> > +
> >> >  	return 0;
> >> >  }
> >> >  
> >> > diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
> >> > index c91bd85eb90e3..da0836cb855af 100644
> >> > --- a/arch/powerpc/mm/book3s64/slb.c
> >> > +++ b/arch/powerpc/mm/book3s64/slb.c
> >> > @@ -441,10 +441,39 @@ static void slb_cache_slbie_user(unsigned int index)
> >> >  	asm volatile("slbie %0" : : "r" (slbie_data));
> >> >  }
> >> >  
> >> > +static void preload_slb_entries(struct task_struct *tsk, struct mm_struct *mm)
> >> Should this be explicitly inline or even __always_inline? I'm thinking
> >> switch_slb is probably a fairly hot path on hash?
> > 
> > Yes absolutely. I'll make this change in v5.
> > 
> >>
> >> > +{
> >> > +	struct thread_info *ti = task_thread_info(tsk);
> >> > +	unsigned char i;
> >> > +
> >> > +	/*
> >> > +	 * We gradually age out SLBs after a number of context switches to
> >> > +	 * reduce reload overhead of unused entries (like we do with FP/VEC
> >> > +	 * reload). Each time we wrap 256 switches, take an entry out of the
> >> > +	 * SLB preload cache.
> >> > +	 */
> >> > +	tsk->thread.load_slb++;
> >> > +	if (!tsk->thread.load_slb) {
> >> > +		unsigned long pc = KSTK_EIP(tsk);
> >> > +
> >> > +		preload_age(ti);
> >> > +		preload_add(ti, pc);
> >> > +	}
> >> > +
> >> > +	for (i = 0; i < ti->slb_preload_nr; i++) {
> >> > +		unsigned char idx;
> >> > +		unsigned long ea;
> >> > +
> >> > +		idx = (ti->slb_preload_tail + i) % SLB_PRELOAD_NR;
> >> > +		ea = (unsigned long)ti->slb_preload_esid[idx] << SID_SHIFT;
> >> > +
> >> > +		slb_allocate_user(mm, ea);
> >> > +	}
> >> > +}
> >> > +
> >> >  /* Flush all user entries from the segment table of the current processor. */
> >> >  void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
> >> >  {
> >> > -	struct thread_info *ti = task_thread_info(tsk);
> >> >  	unsigned char i;
> >> >  
> >> >  	/*
> >> > @@ -502,29 +531,8 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
> >> >  
> >> >  	copy_mm_to_paca(mm);
> >> >  
> >> > -	/*
> >> > -	 * We gradually age out SLBs after a number of context switches to
> >> > -	 * reduce reload overhead of unused entries (like we do with FP/VEC
> >> > -	 * reload). Each time we wrap 256 switches, take an entry out of the
> >> > -	 * SLB preload cache.
> >> > -	 */
> >> > -	tsk->thread.load_slb++;
> >> > -	if (!tsk->thread.load_slb) {
> >> > -		unsigned long pc = KSTK_EIP(tsk);
> >> > -
> >> > -		preload_age(ti);
> >> > -		preload_add(ti, pc);
> >> > -	}
> >> > -
> >> > -	for (i = 0; i < ti->slb_preload_nr; i++) {
> >> > -		unsigned char idx;
> >> > -		unsigned long ea;
> >> > -
> >> > -		idx = (ti->slb_preload_tail + i) % SLB_PRELOAD_NR;
> >> > -		ea = (unsigned long)ti->slb_preload_esid[idx] << SID_SHIFT;
> >> > -
> >> > -		slb_allocate_user(mm, ea);
> >> > -	}
> >> > +	if (!mm->context.skip_slb_preload)
> >> > +		preload_slb_entries(tsk, mm);
> >>
> >> Should this be wrapped in likely()?
> > 
> > Seems like a good idea - yes.
> > 
> >>
> >> >  
> >> >  	/*
> >> >  	 * Synchronize slbmte preloads with possible subsequent user memory
> >>
> >> Right below this comment is the isync. It seems to be specifically
> >> concerned with synchronising preloaded slbs. Do you need it if you are
> >> skipping SLB preloads?
> >>
> >> It's probably not a big deal to have an extra isync in the fairly rare
> >> path when we're skipping preloads, but I thought I'd check.
> > 
> > I don't _think_ we need the `isync` if we are skipping the SLB preloads,
> > but then again it was always in the code-path before. If someone can
> > make a compelling argument to drop it when not preloading SLBs I will,
> > otherwise (considering some of the other non-obvious things I stepped
> > into with the Hash code) I will keep it here for now.
>
> The ISA says slbia wants an isync afterward, so we probably should keep
> it. The comment is a bit misleading in that case.
>
> Why isn't preloading appropriate for a temporary mm?

The preloaded entries come from the thread_info struct which isn't
necessarily related to the temporary mm at all. I saw SLB multihits
while testing this series with my LKDTM test where the "patching
address" (userspace address for the temporary mapping w/
write-permissions) ends up in a thread's preload list and then we
explicitly insert it again in map_patch() when trying to patch. At that
point the SLB multihit triggers.

>
> Thanks,
> Nick