Message-ID: <20170802085111.iupsx6s3hw42a52b@hirez.programming.kicks-ass.net>
Date:   Wed, 2 Aug 2017 10:51:11 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Will Deacon <will.deacon@....com>
Cc:     Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        torvalds@...ux-foundation.org, oleg@...hat.com,
        paulmck@...ux.vnet.ibm.com, mpe@...erman.id.au, npiggin@...il.com,
        linux-kernel@...r.kernel.org, mingo@...nel.org,
        stern@...land.harvard.edu, Mel Gorman <mgorman@...e.de>,
        Rik van Riel <riel@...hat.com>
Subject: Re: [RFC][PATCH 1/5] mm: Rework {set,clear,mm}_tlb_flush_pending()

On Wed, Aug 02, 2017 at 09:43:50AM +0100, Will Deacon wrote:
> On Wed, Aug 02, 2017 at 09:15:23AM +0100, Will Deacon wrote:

> > I really think we should avoid defining TLB invalidation in terms of
> > smp_mb() because it's a lot more subtle than that.
> 
> Another worry I have here is with architectures that can optimise the
> "only need to flush the local TLB" case. For example, this version of 'R':
> 
> 
> P0:
> WRITE_ONCE(x, 1);
> smp_mb();
> WRITE_ONCE(y, 1);
> 
> P1:
> WRITE_ONCE(y, 2);
> flush_tlb_range(...);  // Only needs to flush the local TLB
> r0 = READ_ONCE(x);
> 
> 
> It doesn't seem unreasonable to me for y==2 && r0==0 if the
> flush_tlb_range(...) ends up only doing local invalidation. As a concrete
> example, imagine a CPU with a page table walker that can snoop the local
> store-buffer. Then, the local flush_tlb_range in P1 only needs to progress
> the write to y as far as the store-buffer before it can invalidate the local
> TLB. Once the TLB is invalidated, it can read x knowing that the translation
> is up-to-date wrt the page table, but that read doesn't need to wait for
> write to y to become visible to other CPUs.
> 
> So flush_tlb_range is actually weaker than smp_mb in some respects, yet the
> flush_tlb_pending stuff will still work correctly.

So while I think you're right, and we could live with this (after all,
if we know the mm is CPU local, there shouldn't be any SMP concerns wrt
its page tables), do you really want to make this more complicated?
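
For reference, the outcome discussed in the quoted 'R' variant can be checked
with a toy model. This is only a sketch under stated assumptions, not kernel
code and not a statement about any real architecture: P1's write to y is
split into a "buffer" step and a "drain" step, a local-only flush is assumed
to impose no ordering between the drain and the later read of x, and a full
smp_mb() is assumed to force the drain first. All names (the event enum,
outcome_reachable()) are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical events of the litmus test, one per memory-visible step. */
enum { P0_X, P0_Y, P1_BUF, P1_READ, P1_DRAIN, NEV };

/* Check that an interleaving respects the ordering each CPU guarantees. */
static bool valid(const int *order, bool full_barrier)
{
	int pos[NEV];

	for (int i = 0; i < NEV; i++)
		pos[order[i]] = i;
	if (pos[P0_X] > pos[P0_Y])		/* P0: smp_mb() orders x before y */
		return false;
	if (pos[P1_BUF] > pos[P1_READ])		/* P1: program order */
		return false;
	if (pos[P1_BUF] > pos[P1_DRAIN])	/* cannot drain before buffering */
		return false;
	if (full_barrier && pos[P1_DRAIN] > pos[P1_READ])
		return false;			/* assumed: smp_mb() drains first */
	return true;
}

/* Replay one interleaving against shared memory; test for y==2 && r0==0. */
static bool hits_outcome(const int *order)
{
	int x = 0, y = 0, r0 = -1;

	for (int i = 0; i < NEV; i++) {
		switch (order[i]) {
		case P0_X:	x = 1;	break;
		case P0_Y:	y = 1;	break;
		case P1_BUF:		break;	/* y=2 sits in the store buffer */
		case P1_READ:	r0 = x;	break;
		case P1_DRAIN:	y = 2;	break;	/* y=2 becomes globally visible */
		}
	}
	return y == 2 && r0 == 0;
}

/* Enumerate every permutation of the five events. */
static bool search(int *order, bool *used, int depth, bool full_barrier)
{
	if (depth == NEV)
		return valid(order, full_barrier) && hits_outcome(order);
	for (int e = 0; e < NEV; e++) {
		if (used[e])
			continue;
		used[e] = true;
		order[depth] = e;
		bool found = search(order, used, depth + 1, full_barrier);
		used[e] = false;
		if (found)
			return true;
	}
	return false;
}

/* true if some allowed interleaving ends with y == 2 && r0 == 0 */
bool outcome_reachable(bool full_barrier)
{
	int order[NEV];
	bool used[NEV] = { false };

	return search(order, used, 0, full_barrier);
}
```

In this model outcome_reachable(false) finds the interleaving where P1 reads x
while its write to y is still buffered, whereas outcome_reachable(true) finds
none, which matches the point that a local-only flush_tlb_range() can be
weaker than smp_mb() in this respect.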
