Date:   Wed, 7 Apr 2021 04:01:54 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     Michel Lespinasse <michel@...pinasse.org>
Cc:     Linux-MM <linux-mm@...ck.org>,
        Laurent Dufour <ldufour@...ux.ibm.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Michal Hocko <mhocko@...e.com>,
        Rik van Riel <riel@...riel.com>,
        Paul McKenney <paulmck@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Joel Fernandes <joelaf@...gle.com>,
        Rom Lemarchand <romlem@...gle.com>,
        Linux-Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 24/37] mm: implement speculative handling in
 __do_fault()

On Tue, Apr 06, 2021 at 07:53:20PM -0700, Michel Lespinasse wrote:
> On Wed, Apr 07, 2021 at 03:35:27AM +0100, Matthew Wilcox wrote:
> > On Tue, Apr 06, 2021 at 06:44:49PM -0700, Michel Lespinasse wrote:
> > > In the speculative case, call the vm_ops->fault() method from within
> > > an rcu read locked section, and verify the mmap sequence lock at the
> > > start of the section. A match guarantees that the original vma is still
> > > valid at that time, and that the associated vma->vm_file stays valid
> > > while the vm_ops->fault() method is running.
> > > 
> > > Note that this implies that speculative faults cannot sleep within
> > > the vm_ops->fault method. We will only attempt to fetch existing pages
> > > from the page cache during speculative faults; any miss (or prefetch)
> > > will be handled by falling back to non-speculative fault handling.
> > > 
> > > The speculative handling case also does not preallocate page tables,
> > > as it is always called with a pre-existing page table.
> > 
> > I still don't understand why you want to do this.  The speculative
> > fault that doesn't do I/O is already here, and it's called ->map_pages
> > (which I see you also do later).  So what's the point of this patch?
> 
> I have to admit I did not give much thought to which path would be
> generally most common here.
> 
> The speculative vm_ops->fault path would be used:
> - for private mapping write faults,
> - when fault-around is disabled (probably an uncommon case in general,
>   but actually common at Google).

Why is it disabled?  The PTE table is already being allocated and filled
... why not quickly check the page cache to see if there are other pages
within this 2MB range and fill in their PTEs too?  Even if only one
of them is ever hit, the reduction in page faults is surely worth it.
Obviously if your workload has such non-locality that you hit only one
page in a 2MB range and then no other, it'll lose ... but then you have
a really badly designed workload!
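
As a rough illustration of the trade-off described above, here is a toy
userspace model of fault-around within one PTE table. It is not kernel
code: the struct and the toy_* functions are hypothetical names invented
for this sketch, and 512 entries assumes 4 KiB pages (512 x 4 KiB = 2 MiB).
It only counts how many faults an access pattern takes with and without
filling in the PTEs of neighbouring pages that are already cached.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only; every name below is hypothetical, not a kernel API. */
#define PTES_PER_TABLE 512 /* assumes 4 KiB pages: 512 x 4 KiB = 2 MiB */

struct toy_table {
    bool cached[PTES_PER_TABLE]; /* page is present in the page cache */
    bool mapped[PTES_PER_TABLE]; /* PTE for the page is populated */
};

/* Reset the table and mark the first ncached pages as already cached. */
static void toy_table_init(struct toy_table *t, size_t ncached)
{
    memset(t, 0, sizeof(*t));
    for (size_t i = 0; i < ncached && i < PTES_PER_TABLE; i++)
        t->cached[i] = true;
}

/*
 * Handle one fault at idx: bring the page into the cache if needed and
 * map it. With fault_around set, also map every other page in this
 * table that is already cached. Returns the number of PTEs populated.
 */
static size_t toy_handle_fault(struct toy_table *t, size_t idx,
                               bool fault_around)
{
    size_t filled = 0;

    t->cached[idx] = true; /* stands in for the read on a cache miss */
    t->mapped[idx] = true;
    filled++;

    if (!fault_around)
        return filled;

    for (size_t i = 0; i < PTES_PER_TABLE; i++) {
        if (t->cached[i] && !t->mapped[i]) {
            t->mapped[i] = true;
            filled++;
        }
    }
    return filled;
}

/* Replay n accesses and count how many of them take a fault. */
static size_t toy_count_faults(struct toy_table *t, const size_t *accesses,
                               size_t n, bool fault_around)
{
    size_t faults = 0;

    for (size_t i = 0; i < n; i++) {
        if (!t->mapped[accesses[i]]) {
            faults++;
            toy_handle_fault(t, accesses[i], fault_around);
        }
    }
    return faults;
}
```

With pages 0..15 cached and accesses to pages 0, 1, 2, 3 and 300, this
model takes five faults without fault-around and two with it. An access
pattern that touches only one page per table gains nothing from the extra
loop, which is the losing case mentioned above.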

> That said, I do think your point makes sense in general, especially if
> this could help avoid the per-filesystem enable bit.
