linux-kernel - Re: [RFC PATCH 13/37] mm: implement speculative handling in __handle_mm


Message-Id: <C02655BC-F722-4EAD-B93E-D890A2DEC05A@amacapital.net>
Date:   Thu, 29 Apr 2021 11:04:56 -0700
From:   Andy Lutomirski <luto@...capital.net>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Andy Lutomirski <luto@...nel.org>,
        Michel Lespinasse <michel@...pinasse.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Linux-MM <linux-mm@...ck.org>,
        Laurent Dufour <ldufour@...ux.ibm.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Michal Hocko <mhocko@...e.com>,
        Rik van Riel <riel@...riel.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Joel Fernandes <joelaf@...gle.com>,
        Rom Lemarchand <romlem@...gle.com>,
        Linux-Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 13/37] mm: implement speculative handling in __handle_mm_fault().



> On Apr 29, 2021, at 9:12 AM, Matthew Wilcox <willy@...radead.org> wrote:
> 
> On Wed, Apr 28, 2021 at 05:05:17PM -0700, Andy Lutomirski wrote:
>>> On Wed, Apr 28, 2021 at 5:02 PM Michel Lespinasse <michel@...pinasse.org> wrote:
>>> Thanks Paul for confirming / clarifying this. BTW, it would be good to
>>> add this to the rcu header files, just so people have something to
>>> refer to when they depend on such behavior (like fast GUP
>>> currently does).
>> 
>> Or, even better, fast GUP could add an explicit RCU read lock.
>> 
>>> 
>>> Going back to my patch. I don't need to protect against THP splitting
>>> here, as I'm only handling the small page case. So when
>>> MMU_GATHER_RCU_TABLE_FREE is enabled, I *think* I could get away with
>>> using only an rcu read lock, instead of disabling interrupts which
>>> implicitly creates the rcu read lock. I'm not sure which way to go -
>>> fast GUP always disables interrupts regardless of the
>>> MMU_GATHER_RCU_TABLE_FREE setting, and I think there is a case to be
>>> made for following the fast GUP steps rather than trying to be smarter.
>> 
>> How about adding some little helpers:
>> 
>> lockless_page_walk_begin();
>> 
>> lockless_page_walk_end();
>> 
>> these turn into RCU read locks if MMU_GATHER_RCU_TABLE_FREE and into
>> irqsave otherwise.  And they're somewhat self-documenting.
> 
> One of the worst things we can do while holding a spinlock is take a
> cache miss because we then delay for several thousand cycles to wait for
> the cache line.  That gives every other CPU a really long opportunity
> to slam into the spinlock and things go downhill fast at that point.
> We've even seen patches to do things like read A, take lock L, then read
> A to avoid the cache miss while holding the lock.
> 
> What sort of performance effect would it have to free page tables
> under RCU for all architectures?  It's painful on s390 & powerpc because
> different tables share the same struct page, but I have to believe that's
> a solvable problem.

The IPI locking mechanism is entirely useless on any architecture that wants to do paravirt shootdowns, so this seems like a good strategy to me.
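
For reference, the lockless_page_walk_begin() / lockless_page_walk_end() helpers suggested earlier in the thread might look roughly like the sketch below. This is a userspace mock-up, not kernel code: the mock_* functions are stand-ins for rcu_read_lock(), rcu_read_unlock(), local_irq_save() and local_irq_restore(), the config option is modeled as a plain macro, and the flags-passing signature is an assumption, since the thread only names the helpers.

```c
#include <stdbool.h>

/* Stand-in for the kernel config option (assumption: modeled as a 0/1 macro).
 * Build with -DMMU_GATHER_RCU_TABLE_FREE=0 to exercise the irq-disable path. */
#ifndef MMU_GATHER_RCU_TABLE_FREE
#define MMU_GATHER_RCU_TABLE_FREE 1
#endif

/* Mock state so the sketch can run outside the kernel. */
static int mock_rcu_depth;               /* nesting depth of mock RCU read-side sections */
static bool mock_irqs_enabled = true;    /* mock "interrupts enabled" flag */

/* Hypothetical stand-ins for the real kernel primitives. */
static inline void mock_rcu_read_lock(void)   { mock_rcu_depth++; }
static inline void mock_rcu_read_unlock(void) { mock_rcu_depth--; }

static inline unsigned long mock_local_irq_save(void)
{
	unsigned long flags = mock_irqs_enabled ? 1UL : 0UL;

	mock_irqs_enabled = false;
	return flags;
}

static inline void mock_local_irq_restore(unsigned long flags)
{
	mock_irqs_enabled = (flags != 0);
}

/*
 * Begin a lockless page table walk.  With RCU table freeing, an RCU
 * read-side section is enough to keep page tables from being freed;
 * otherwise interrupts are disabled, as fast GUP does.
 * Returns the saved irq state (unused on the RCU path).
 */
static inline unsigned long lockless_page_walk_begin(void)
{
#if MMU_GATHER_RCU_TABLE_FREE
	mock_rcu_read_lock();
	return 0;
#else
	return mock_local_irq_save();
#endif
}

/* End the walk, undoing whatever lockless_page_walk_begin() did. */
static inline void lockless_page_walk_end(unsigned long flags)
{
#if MMU_GATHER_RCU_TABLE_FREE
	(void)flags;
	mock_rcu_read_unlock();
#else
	mock_local_irq_restore(flags);
#endif
}
```

A caller would bracket the walk with the pair, passing the value returned by lockless_page_walk_begin() to lockless_page_walk_end(). If page tables were freed under RCU on all architectures, as raised above, the irq-disable branch could presumably go away.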