Message-ID: <alpine.DEB.2.02.1402061537180.3441@chino.kir.corp.google.com>
Date: Thu, 6 Feb 2014 15:48:22 -0800 (PST)
From: David Rientjes <rientjes@...gle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
cc: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>,
	Fengguang Wu <fengguang.wu@...el.com>,
	David Cohen <david.a.cohen@...ux.intel.com>,
	Al Viro <viro@...iv.linux.org.uk>,
	Damien Ramonda <damien.ramonda@...el.com>, Jan Kara <jack@...e.cz>,
	Linus <torvalds@...ux-foundation.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH V5] mm readahead: Fix readahead fail for no local memory and limit readahead pages

On Thu, 6 Feb 2014, Andrew Morton wrote:

> On Thu, 6 Feb 2014 14:58:21 -0800 (PST) David Rientjes <rientjes@...gle.com> wrote:
>
> > > > +#define MAX_REMOTE_READAHEAD   4096UL
> > > >  /*
> > > >   * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a
> > > >   * sensible upper limit.
> > > >   */
> > > >  unsigned long max_sane_readahead(unsigned long nr)
> > > >  {
> > > > -	return min(nr, (node_page_state(numa_node_id(), NR_INACTIVE_FILE)
> > > > -		+ node_page_state(numa_node_id(), NR_FREE_PAGES)) / 2);
> > > > +	unsigned long local_free_page;
> > > > +	int nid;
> > > > +
> > > > +	nid = numa_node_id();
> >
> > If you're intending this to be cached for your calls into
> > node_page_state() you need nid = ACCESS_ONCE(numa_node_id()).
>
> ugh.  That's too subtle and we didn't even document it.
>
> We could put the ACCESS_ONCE inside numa_node_id() I assume but we
> still have the same problem as smp_processor_id(): the numa_node_id()
> return value is wrong as soon as you obtain it if running preemptibly.
>
> We could plaster Big Fat Warnings all over the place or we could treat
> numa_node_id() and derivatives in the same way as smp_processor_id()
> (which is a huge pain).  Or something else, but we've left a big hand
> grenade here and Raghavendra won't be the last one to pull the pin?
>

Normally it wouldn't matter because there's no significant downside to it
racing, things like mempolicies which use numa_node_id() extensively would
result in, oops, a page allocation on the wrong node.

This stands out to me, though, because you're expecting the calculation to
be correct for a specific node.

The patch is still wrong, though, it should just do

	int node = ACCESS_ONCE(numa_mem_id());
	return min(nr, (node_page_state(node, NR_INACTIVE_FILE) +
		        node_page_state(node, NR_FREE_PAGES)) / 2);

since we want to readahead based on the cpu's local node, the comment
saying we're reading ahead onto "remote memory" is wrong since a
memoryless node has local affinity to numa_mem_id().
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
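
[Editorial sketch, not part of the original message.] The point about
caching the node id can be illustrated with a small userspace model. This
is only a sketch under stated assumptions: fake_numa_mem_id(),
current_mem_node, the node_stats table and both max_sane_readahead_*()
variants are hypothetical stand-ins written for this example, not kernel
code. The "migration" is simulated by flipping the node after every lookup,
which is far more aggressive than a real preemptible task would see. It
shows how two separate lookups can mix counters from different nodes, while
a single snapshot keeps the calculation consistent for one node.

```c
#include <stdio.h>

#define NR_NODES 2

/* Hypothetical stand-ins for the kernel's per-node counters. */
enum stat_item { NR_INACTIVE_FILE, NR_FREE_PAGES, NR_STAT_ITEMS };

const unsigned long node_stats[NR_NODES][NR_STAT_ITEMS] = {
	{ 1000, 3000 },	/* node 0: inactive file pages, free pages */
	{   10,   30 },	/* node 1 */
};

/* Node the simulated task is currently running near. */
volatile int current_mem_node = 0;

/*
 * Hypothetical model of numa_mem_id() on a preemptible task: the value
 * is stale as soon as it is returned, simulated here by moving the task
 * to the other node after every lookup.
 */
int fake_numa_mem_id(void)
{
	int nid = current_mem_node;

	current_mem_node = (nid + 1) % NR_NODES;
	return nid;
}

unsigned long node_page_state(int node, enum stat_item item)
{
	return node_stats[node][item];
}

unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Two lookups: the counters may come from two different nodes. */
unsigned long max_sane_readahead_racy(unsigned long nr)
{
	unsigned long inactive =
		node_page_state(fake_numa_mem_id(), NR_INACTIVE_FILE);
	unsigned long free_pages =
		node_page_state(fake_numa_mem_id(), NR_FREE_PAGES);

	return min_ul(nr, (inactive + free_pages) / 2);
}

/* One lookup, stored in a local: both counters describe the same node. */
unsigned long max_sane_readahead_snapshot(unsigned long nr)
{
	int node = fake_numa_mem_id();

	return min_ul(nr, (node_page_state(node, NR_INACTIVE_FILE) +
			   node_page_state(node, NR_FREE_PAGES)) / 2);
}
```

In the kernel the extra concern raised in the thread is that the compiler
may re-evaluate the per-cpu read unless it is forced to happen once, which
is what the suggested ACCESS_ONCE() is for; the model above sidesteps that
by making the lookup an ordinary function call with a side effect.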