linux-kernel - Re: [RFC PATCH 13/37] mm: implement speculative handling in __handle_mm

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20210429193456.GI1847222@casper.infradead.org>
Date:   Thu, 29 Apr 2021 20:34:56 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     Michel Lespinasse <michel@...pinasse.org>
Cc:     Andy Lutomirski <luto@...nel.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Linux-MM <linux-mm@...ck.org>,
        Laurent Dufour <ldufour@...ux.ibm.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Michal Hocko <mhocko@...e.com>,
        Rik van Riel <riel@...riel.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Joel Fernandes <joelaf@...gle.com>,
        Rom Lemarchand <romlem@...gle.com>,
        Linux-Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 13/37] mm: implement speculative handling in
 __handle_mm_fault().

On Thu, Apr 29, 2021 at 12:14:28PM -0700, Michel Lespinasse wrote:
> On Thu, Apr 29, 2021 at 05:12:34PM +0100, Matthew Wilcox wrote:
> > One of the worst things we can do while holding a spinlock is take a
> > cache miss because we then delay for several thousand cycles to wait for
> > the cache line.  That gives every other CPU a really long opportunity
> > to slam into the spinlock and things go downhill fast at that point.
> > We've even seen patches to do things like read A, take lock L, then read
> > A to avoid the cache miss while holding the lock.
> 
> I understand the effect your are describing, but I do not see how it
> applies here - what cacheline are we likely to miss on when using
> local_irq_disable() that we wouldn't touch if using rcu_read_lock() ?

It's the same cache lines in both cases.  The difference is that in one
case we have interrupts disabled (and a spinlock held?  i wasn't clear
on that) and in the other case, we just have preemption disabled.

> > What sort of performance effect would it have to free page tables
> > under RCU for all architectures?  It's painful on s390 & powerpc because
> > different tables share the same struct page, but I have to believe that's
> > a solvable problem.
> 
> I agree using RCU to free page tables would be a good thing to try.
> I am afraid of adding that to this patchset though, as it seems
> somewhate unrelated and adds risk. IMO we are most likely to find
> justification for pushing this if/when we try accessing remote mm's without
> taking the mmap lock, since disabling IPIs clearly wouldn't work there.

I think that needs to happen _before_ this patchset.  Creating a mess and
then trying to clean it up later isn't a great way to do development.