Date: Mon, 7 Mar 2022 17:43:14 +0200
From: Jarkko Sakkinen <jarkko@...nel.org>
To: Matthew Wilcox <willy@...radead.org>
Cc: Dave Hansen <dave.hansen@...el.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Nathaniel McCallum <nathaniel@...fian.com>,
	Reinette Chatre <reinette.chatre@...el.com>,
	linux-sgx@...r.kernel.org, jaharkes@...cmu.edu,
	linux-mips@...r.kernel.org, linux-kernel@...r.kernel.org,
	intel-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
	codalist@...emann.coda.cs.cmu.edu, linux-unionfs@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH RFC v2] mm: Add f_ops->populate()

On Mon, Mar 07, 2022 at 02:37:48PM +0000, Matthew Wilcox wrote:
> On Sun, Mar 06, 2022 at 03:41:54PM -0800, Dave Hansen wrote:
> > In short: page faults stink.  The core kernel has lots of ways of
> > avoiding page faults like madvise(MADV_WILLNEED) or mmap(MAP_POPULATE).
> > But, those only work on normal RAM that the core mm manages.
> >
> > SGX is weird.  SGX memory is managed outside the core mm.  It doesn't
> > have a 'struct page' and get_user_pages() doesn't work on it.  Its VMAs
> > are marked with VM_IO.  So, none of the existing methods for avoiding
> > page faults work on SGX memory.
> >
> > This essentially helps extend existing "normal RAM" kernel ABIs to work
> > for avoiding faults for SGX too.  SGX users want to enjoy all of the
> > benefits of a delayed allocation policy (better resource use,
> > overcommit, NUMA affinity) but without the cost of millions of faults.
>
> We have a mechanism for dynamically reducing the number of page faults
> already; it's just buried in the page cache code.  You have vma->vm_file,
> which contains a file_ra_state.  You can use this to track where
> recent faults have been and grow the size of the region you fault in
> per page fault.  You don't have to (indeed probably don't want to) use
> the same algorithm as the page cache, but the _principle_ is the same --
> were recent speculative faults actually used; should we grow the number
> of pages actually faulted in, or is this a random sparse workload where
> we want to allocate individual pages.
>
> Don't rely on the user to ask.  They don't know.

This sounds like a possibility. I'll need to study it properly first
though. Thank you for pointing this out.

BR, Jarkko
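
For illustration only, the principle Matthew describes (grow the number of pages faulted in while recent speculative faults are being used, fall back to single pages for a sparse workload) could be sketched in user-space C roughly as follows. Everything here is an assumption made for the example: struct fault_window and fault_window_next() are hypothetical names, not kernel APIs, and the doubling policy is neither the page cache readahead algorithm nor anything proposed in this thread. "Used" is approximated as "the next fault lands on the page right after the previously populated window".

```c
#include <stdbool.h>

/*
 * Hypothetical per-mapping state. A real driver might keep something
 * similar next to the file_ra_state reachable through vma->vm_file.
 */
struct fault_window {
	unsigned long next;	/* page offset just past the last populated window */
	unsigned long size;	/* pages populated by the last fault, 0 = no history */
	unsigned long max;	/* upper bound on the window, in pages (>= 1) */
};

/*
 * Decide how many pages to populate for a fault at page offset pgoff.
 *
 * If the fault lands exactly where the previous window ended, assume the
 * speculatively populated pages were used and double the window (capped
 * at max). Any other offset is treated as sparse access and the window
 * drops back to a single page.
 */
static unsigned long fault_window_next(struct fault_window *fw,
				       unsigned long pgoff)
{
	bool sequential = fw->size != 0 && pgoff == fw->next;

	if (sequential) {
		fw->size *= 2;
		if (fw->size > fw->max)
			fw->size = fw->max;
	} else {
		fw->size = 1;
	}

	fw->next = pgoff + fw->size;
	return fw->size;
}
```

With max set to 8, a run of faults at offsets 100, 101, 103 and 107 would populate 1, 2, 4 and 8 pages, and a later fault at an unrelated offset would shrink the window back to one page. The user never has to request population explicitly, which is the point of "Don't rely on the user to ask."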