Date:   Mon, 7 Mar 2022 17:43:14 +0200
From:   Jarkko Sakkinen <jarkko@...nel.org>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Dave Hansen <dave.hansen@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Nathaniel McCallum <nathaniel@...fian.com>,
        Reinette Chatre <reinette.chatre@...el.com>,
        linux-sgx@...r.kernel.org, jaharkes@...cmu.edu,
        linux-mips@...r.kernel.org, linux-kernel@...r.kernel.org,
        intel-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
        codalist@...emann.coda.cs.cmu.edu, linux-unionfs@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH RFC v2] mm: Add f_ops->populate()

On Mon, Mar 07, 2022 at 02:37:48PM +0000, Matthew Wilcox wrote:
> On Sun, Mar 06, 2022 at 03:41:54PM -0800, Dave Hansen wrote:
> > In short: page faults stink.  The core kernel has lots of ways of
> > avoiding page faults, like madvise(MADV_WILLNEED) or mmap(MAP_POPULATE)
> > (a minimal usage sketch of both follows this quote).  But those only
> > work on normal RAM that the core mm manages.
> > 
> > SGX is weird.  SGX memory is managed outside the core mm.  It doesn't
> > have a 'struct page' and get_user_pages() doesn't work on it.  Its VMAs
> > are marked with VM_IO.  So, none of the existing methods for avoiding
> > page faults work on SGX memory.
> > 
> > This essentially helps extend existing "normal RAM" kernel ABIs to work
> > for avoiding faults for SGX too.  SGX users want to enjoy all of the
> > benefits of a delayed allocation policy (better resource use,
> > overcommit, NUMA affinity) but without the cost of millions of faults.
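
For concreteness, here is a minimal userspace sketch of the two
"normal RAM" prefaulting ABIs mentioned above. Error handling is
trimmed, and the madvise() call is shown as an alternative to
MAP_POPULATE rather than a required follow-up:

#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 64UL << 20;	/* 64 MiB */

	/* Prefault at map time: MAP_POPULATE asks the kernel to fault
	 * the whole range in up front, avoiding later demand faults. */
	char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
	if (buf == MAP_FAILED)
		return 1;

	/* Alternatively, prefault (part of) an existing mapping:
	 * MADV_WILLNEED hints that the range will be needed soon. */
	if (madvise(buf, len / 2, MADV_WILLNEED))
		return 1;

	memset(buf, 1, len);	/* touches already-resident pages */
	munmap(buf, len);
	return 0;
}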
> 
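The hook the patch proposes would let a driver that manages its own
memory (like SGX's EPC) do that prefaulting itself. Below is a
self-contained model of the idea only; every name and the signature
are made up for illustration, so see the patch itself for the real
interface:

#include <stdio.h>

struct vma;			/* stand-in for vm_area_struct */
struct fops {			/* stand-in for file_operations */
	int (*populate)(struct vma *vma, unsigned long start,
			unsigned long end);
};
struct vma {
	const struct fops *f_op;
	unsigned long vm_start, vm_end;
};

/* A driver-side hook: add backing pages for [start, end) up front,
 * without going through the page fault path at all. */
static int sgx_populate(struct vma *vma, unsigned long start,
			unsigned long end)
{
	printf("prefaulting %#lx-%#lx\n", start, end);
	return 0;
}

/* The core-mm side: if the VMA's owner provides a populate hook,
 * call it; otherwise fall back to ordinary demand faulting. */
static int populate_range(struct vma *vma, unsigned long start,
			  unsigned long end)
{
	if (vma->f_op && vma->f_op->populate)
		return vma->f_op->populate(vma, start, end);
	return -1;
}

int main(void)
{
	const struct fops sgx_fops = { .populate = sgx_populate };
	struct vma v = { .f_op = &sgx_fops,
			 .vm_start = 0x1000, .vm_end = 0x5000 };

	return populate_range(&v, v.vm_start, v.vm_end);
}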
> We have a mechanism for dynamically reducing the number of page faults
> already; it's just buried in the page cache code.  You have vma->vm_file,
> which contains a file_ra_state.  You can use this to track where
> recent faults have been and grow the size of the region you fault in
> per page fault.  You don't have to (indeed probably don't want to) use
> the same algorithm as the page cache, but the _principle_ is the same --
> were recent speculative faults actually used?  Should we grow the number
> of pages actually faulted in, or is this a random sparse workload where
> we want to allocate individual pages?  (A toy sketch of this windowing
> idea follows below.)
> 
> Don't rely on the user to ask.  They don't know.
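
A toy model of that principle, for illustration only: this is not the
page cache's actual readahead code, and the state layout, thresholds,
and names below are assumptions. Per-file state remembers the last
fault-around window, and the window grows only when the speculatively
faulted pages were actually touched:

#include <stdio.h>

struct ra_state {
	unsigned long prev_start;	/* last window start (pages) */
	unsigned int prev_size;		/* last window size (pages) */
	unsigned int used;		/* pages of it actually touched */
};

/* Decide how many pages to fault in around 'index'. */
static unsigned int fault_window(struct ra_state *ra, unsigned long index)
{
	unsigned int size;

	if (ra->used * 2 > ra->prev_size)
		size = ra->prev_size * 2;	/* speculation paid off: grow */
	else
		size = 1;			/* sparse/random: single pages */
	if (size > 32)
		size = 32;			/* cap the window */

	ra->prev_start = index;
	ra->prev_size = size;
	ra->used = 0;
	return size;
}

/* Called when a speculatively faulted page is actually accessed. */
static void mark_used(struct ra_state *ra, unsigned long index)
{
	if (index >= ra->prev_start && index < ra->prev_start + ra->prev_size)
		ra->used++;
}

int main(void)
{
	struct ra_state ra = { 0 };
	unsigned long i, n = fault_window(&ra, 0);

	for (i = 0; i < n; i++)		/* sequential use of the window */
		mark_used(&ra, i);
	printf("next window: %u pages\n", fault_window(&ra, n));
	return 0;
}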

This sounds like a possibility. I'll need to study it properly first,
though. Thank you for pointing this out.

BR, Jarkko
