Message-ID: <cd84f3b8-4881-c413-4670-f8c86ad10001@synopsys.com>
Date:   Wed, 2 Aug 2017 18:30:43 +0530
From:   Vineet Gupta <Vineet.Gupta1@...opsys.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        "torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>,
        "will.deacon@....com" <will.deacon@....com>,
        "oleg@...hat.com" <oleg@...hat.com>,
        "paulmck@...ux.vnet.ibm.com" <paulmck@...ux.vnet.ibm.com>,
        "benh@...nel.crashing.org" <benh@...nel.crashing.org>,
        "mpe@...erman.id.au" <mpe@...erman.id.au>,
        "npiggin@...il.com" <npiggin@...il.com>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "mingo@...nel.org" <mingo@...nel.org>,
        "stern@...land.harvard.edu" <stern@...land.harvard.edu>,
        Russell King <linux@...linux.org.uk>,
        "Heiko Carstens" <heiko.carstens@...ibm.com>,
        Ralf Baechle <ralf@...ux-mips.org>,
        arcml <linux-snps-arc@...ts.infradead.org>,
        "David S. Miller" <davem@...emloft.net>,
        Mel Gorman <mgorman@...e.de>, Rik van Riel <riel@...hat.com>
Subject: ARC stuff (was Re: [PATCH -v2 1/4] mm: Rework
 {set,clear,mm}_tlb_flush_pending())

On 08/02/2017 05:19 PM, Peter Zijlstra wrote:
> Commit:
>
>    af2c1401e6f9 ("mm: numa: guarantee that tlb_flush_pending updates are visible before page table updates")
>
> added smp_mb__before_spinlock() to set_tlb_flush_pending(). I think we
> can solve the same problem without this barrier.
>
> If instead we mandate that mm_tlb_flush_pending() is used while
> holding the PTL we're guaranteed to observe prior
> set_tlb_flush_pending() instances.
>
> For this to work we need to rework migrate_misplaced_transhuge_page()
> a little and move the test up into do_huge_pmd_numa_page().
>
> NOTE: this relies on flush_tlb_range() to guarantee:
>
>     (1) it ensures that prior page table updates are visible to the
>         page table walker and

ARC doesn't have a hw page table walker, so this is not relevant.

>     (2) it ensures that subsequent memory accesses are only made
>         visible after the invalidation has completed

flush_tlb_range() does a bunch of aux register accesses; I need to check with hw
folks whether those can be assumed to be serializing w.r.t. memory ordering.
But if not, then we need to add an explicit smp barrier (which will not be
paired?), and that would penalize the other callers of flush_tlb_range(). Would
a new API for this be overkill? Is a memory barrier needed here anyway (like on
ARM)?
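
To make the question concrete, here is a rough sketch (not the actual
arch/arc/mm/tlb.c code, just an illustration assuming the aux register
accesses turn out not to be ordered) of where such a barrier would sit and
why every caller of flush_tlb_range() would end up paying for it:

void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
		     unsigned long end)
{
	/* existing aux register based shootdown of [start, end) */
	local_flush_tlb_range(vma, start, end);

	/*
	 * Only needed if the aux register writes above are not themselves
	 * serializing: keeps subsequent memory accesses from becoming
	 * visible before the invalidate completes (guarantee (2) above),
	 * at the cost of every other flush_tlb_range() caller.
	 */
	smp_mb();
}

The alternative would be a dedicated variant (say flush_tlb_range_sync(),
name made up) that carries the barrier and is used only from the migration
path, leaving the common callers untouched.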


>
> This is required for architectures that implement TRANSPARENT_HUGEPAGE
> (arc, arm, arm64, mips, powerpc, s390, sparc, x86) or otherwise use
> mm_tlb_flush_pending() in their page-table operations (arm, arm64,
> x86).
>
> This appears true for:
>
>   - arm (DSB ISB before and after),
>   - arm64 (DSB ISHST before, and DSB ISH after),
>   - powerpc (PTESYNC before and after),
>   - s390 and x86 TLB invalidate are serializing instructions
>
> But I failed to understand the situation for:
>
>   - arc, mips, sparc
>
> Now SPARC64 is a wee bit special in that flush_tlb_range() is a no-op
> and it flushes the TLBs using arch_{enter,leave}_lazy_mmu_mode()
> inside the PTL. It still needs to guarantee the PTL unlock happens
> _after_ the invalidate completes.
>
> Vineet, Ralf and Dave could you guys please have a look?
>
> Cc: Russell King <linux@...linux.org.uk>
> Cc: Will Deacon <will.deacon@....com>
> Cc: Benjamin Herrenschmidt <benh@...nel.crashing.org>
> Cc: Heiko Carstens <heiko.carstens@...ibm.com>
> Cc: Ralf Baechle <ralf@...ux-mips.org>
> Cc: Vineet Gupta <vgupta@...opsys.com>
> Cc: "David S. Miller" <davem@...emloft.net>
> Cc: Mel Gorman <mgorman@...e.de>
> Cc: Rik van Riel <riel@...hat.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> ---
>   include/linux/mm_types.h |   33 +++++++++++++++++++++++++++------
>   mm/huge_memory.c         |   20 ++++++++++++++++++++
>   mm/migrate.c             |    6 ------
>   3 files changed, 47 insertions(+), 12 deletions(-)
>
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -527,23 +527,44 @@ static inline cpumask_t *mm_cpumask(stru
>    */
>   static inline bool mm_tlb_flush_pending(struct mm_struct *mm)
>   {
> -	barrier();
> +	/*
> +	 * Must be called with PTL held; such that our PTL acquire will have
> +	 * observed the store from set_tlb_flush_pending().
> +	 */
>   	return mm->tlb_flush_pending;
>   }
>   static inline void set_tlb_flush_pending(struct mm_struct *mm)
>   {
>   	mm->tlb_flush_pending = true;
> -
>   	/*
> -	 * Guarantee that the tlb_flush_pending store does not leak into the
> -	 * critical section updating the page tables
> +	 * The only time this value is relevant is when there are indeed pages
> +	 * to flush. And we'll only flush pages after changing them, which
> +	 * requires the PTL.
> +	 *
> +	 * So the ordering here is:
> +	 *
> +	 *	mm->tlb_flush_pending = true;
> +	 *	spin_lock(&ptl);
> +	 *	...
> +	 *	set_pte_at();
> +	 *	spin_unlock(&ptl);
> +	 *
> +	 *				spin_lock(&ptl)
> +	 *				mm_tlb_flush_pending();
> +	 *				....
> +	 *				spin_unlock(&ptl);
> +	 *
> +	 *	flush_tlb_range();
> +	 *	mm->tlb_flush_pending = false;
> +	 *
> +	 * So the =true store is constrained by the PTL unlock, and the =false
> +	 * store is constrained by the TLB invalidate.
>   	 */
> -	smp_mb__before_spinlock();
>   }
>   /* Clearing is done after a TLB flush, which also provides a barrier. */
>   static inline void clear_tlb_flush_pending(struct mm_struct *mm)
>   {
> -	barrier();
> +	/* see set_tlb_flush_pending */
>   	mm->tlb_flush_pending = false;
>   }
>   #else
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1410,6 +1410,7 @@ int do_huge_pmd_numa_page(struct vm_faul
>   	unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
>   	int page_nid = -1, this_nid = numa_node_id();
>   	int target_nid, last_cpupid = -1;
> +	bool need_flush = false;
>   	bool page_locked;
>   	bool migrated = false;
>   	bool was_writable;
> @@ -1496,10 +1497,29 @@ int do_huge_pmd_numa_page(struct vm_faul
>   	}
>   
>   	/*
> +	 * Since we took the NUMA fault, we must have observed the !accessible
> +	 * bit. Make sure all other CPUs agree with that, to avoid them
> +	 * modifying the page we're about to migrate.
> +	 *
> +	 * Must be done under PTL such that we'll observe the relevant
> +	 * set_tlb_flush_pending().
> +	 */
> +	if (mm_tlb_flush_pending(vma->vm_mm))
> +		need_flush = true;
> +
> +	/*
>   	 * Migrate the THP to the requested node, returns with page unlocked
>   	 * and access rights restored.
>   	 */
>   	spin_unlock(vmf->ptl);
> +
> +	/*
> +	 * We are not sure a pending tlb flush here is for a huge page
> +	 * mapping or not. Hence use the tlb range variant
> +	 */
> +	if (need_flush)
> +		flush_tlb_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
> +
>   	migrated = migrate_misplaced_transhuge_page(vma->vm_mm, vma,
>   				vmf->pmd, pmd, vmf->address, page, target_nid);
>   	if (migrated) {
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1937,12 +1937,6 @@ int migrate_misplaced_transhuge_page(str
>   		put_page(new_page);
>   		goto out_fail;
>   	}
> -	/*
> -	 * We are not sure a pending tlb flush here is for a huge page
> -	 * mapping or not. Hence use the tlb range variant
> -	 */
> -	if (mm_tlb_flush_pending(mm))
> -		flush_tlb_range(vma, mmun_start, mmun_end);
>   
>   	/* Prepare a page as a migration target */
>   	__SetPageLocked(new_page);
>
>
>
