linux-kernel - Re: [RFC PATCH 13/37] mm: implement speculative handling in __handle_mm

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20210429161234.GG1847222@casper.infradead.org>
Date:   Thu, 29 Apr 2021 17:12:34 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     Andy Lutomirski <luto@...nel.org>
Cc:     Michel Lespinasse <michel@...pinasse.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Linux-MM <linux-mm@...ck.org>,
        Laurent Dufour <ldufour@...ux.ibm.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Michal Hocko <mhocko@...e.com>,
        Rik van Riel <riel@...riel.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Joel Fernandes <joelaf@...gle.com>,
        Rom Lemarchand <romlem@...gle.com>,
        Linux-Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 13/37] mm: implement speculative handling in
 __handle_mm_fault().

On Wed, Apr 28, 2021 at 05:05:17PM -0700, Andy Lutomirski wrote:
> On Wed, Apr 28, 2021 at 5:02 PM Michel Lespinasse <michel@...pinasse.org> wrote:
> > Thanks Paul for confirming / clarifying this. BTW, it would be good to
> > add this to the rcu header files, just so people have something to
> > reference to when they depend on such behavior (like fast GUP
> > currently does).
> 
> Or, even better, fast GUP could add an explicit RCU read lock.
> 
> >
> > Going back to my patch. I don't need to protect against THP splitting
> > here, as I'm only handling the small page case. So when
> > MMU_GATHER_RCU_TABLE_FREE is enabled, I *think* I could get away with
> > using only an rcu read lock, instead of disabling interrupts which
> > implicitly creates the rcu read lock. I'm not sure which way to go -
> > fast GUP always disables interrupts regardless of the
> > MMU_GATHER_RCU_TABLE_FREE setting, and I think there is a case to be
> > made for following the fast GUP stes rather than trying to be smarter.
> 
> How about adding some little helpers:
> 
> lockless_page_walk_begin();
> 
> lockless_page_walk_end();
> 
> these turn into RCU read locks if MMU_GATHER_RCU_TABLE_FREE and into
> irqsave otherwise.  And they're somewhat self-documenting.

One of the worst things we can do while holding a spinlock is take a
cache miss because we then delay for several thousand cycles to wait for
the cache line.  That gives every other CPU a really long opportunity
to slam into the spinlock and things go downhill fast at that point.
We've even seen patches to do things like read A, take lock L, then read
A to avoid the cache miss while holding the lock.

What sort of performance effect would it have to free page tables
under RCU for all architectures?  It's painful on s390 & powerpc because
different tables share the same struct page, but I have to believe that's
a solvable problem.