Date: Tue, 31 Dec 2013 16:37:16 +0530
From: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: Andrew Morton <akpm@...ux-foundation.org>, Jan Kara <jack@...e.cz>,
	Fengguang Wu <fengguang.wu@...el.com>,
	David Cohen <david.a.cohen@...ux.intel.com>,
	Al Viro <viro@...iv.linux.org.uk>,
	Damien Ramonda <damien.ramonda@...el.com>,
	linux-mm <linux-mm@...ck.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC] mm readahead: Fix the readahead fail in case of empty numa node

On 12/14/2013 06:09 AM, Linus Torvalds wrote:
> On Wed, Dec 11, 2013 at 3:05 PM, Andrew Morton
> <akpm@...ux-foundation.org> wrote:
>>
>> But I'm really struggling to think up an implementation! The current
>> code looks only at the caller's node and doesn't seem to make much
>> sense. Should we look at all nodes? Hard to say without prior
>> knowledge of where those pages will be coming from.
>
> I really think we want to put an upper bound on the read-ahead, and
> I'm not convinced we need to try to be excessively clever about it. We
> also probably don't want to make it too expensive to calculate,
> because afaik this ends up being called for each file we open when we
> don't have pages in the page cache yet.
>
> The current function seems reasonable on a single-node system. Let's
> not kill it entirely just because it has some odd corner case on
> multi-node systems.
>
> In fact, for all I care, I think it would be perfectly ok to just use
> a truly stupid hard limit ("you can't read-ahead more than 16MB" or
> whatever).
>
> What we do *not* want to allow is to have people call "readahead"
> functions and basically kill the machine because you now have an
> unkillable IO that is insanely big. So I'd much rather limit it too
> much than too little. And on absolutely no sane IO subsystem does it
> make sense to read ahead insane amounts.
>
> So I'd rather limit it to something stupid and small than not limit
> things at all.
>
> Looking at the interface, for example, the natural thing to do for the
> "readahead()" system call is to just give it a size of ~0ul and let
> the system limit things, because limiting things in user space is
> just not reasonable.
>
> So I really do *not* think it's fine to just remove the limit entirely.

Very sorry for the late reply (I was on a very long vacation).

How about having a 16MB limit only for remote readaheads and continuing
the rest as is, something like below:

#define MAX_REMOTE_READAHEAD   4096UL
unsigned long max_sane_readahead(unsigned long nr)
{
	unsigned long local_free_page = node_page_state(numa_node_id(), NR_INACTIVE_FILE)
				      + node_page_state(numa_node_id(), NR_FREE_PAGES);
	unsigned long sane_nr = min(nr, MAX_REMOTE_READAHEAD);

	return local_free_page ? min(nr, local_free_page / 2) : sane_nr;
}

Or we can enforce the 16MB limit for all cases too. I'll send a patch
accordingly. (The readahead max in bytes will scale accordingly if the
page size is not 4k.)
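[Editor's note: to make the two branches of the proposal concrete, here is a
minimal userspace sketch, assuming 4k pages (so the 4096-page
MAX_REMOTE_READAHEAD cap works out to 16MB). numa_node_id() and
node_page_state() are real kernel helpers, but the stub implementations,
the fake_counts array, and the main() harness below are hypothetical,
added only to show the behavior: a node with local pages caps readahead
at half its free + inactive-file pages, while an empty (memoryless) node
falls back to the fixed remote cap instead of returning 0.]

#include <stdio.h>

#define MAX_REMOTE_READAHEAD   4096UL   /* pages: 4096 * 4k = 16MB */

enum node_stat { NR_INACTIVE_FILE, NR_FREE_PAGES };

/* Hypothetical per-node counters, settable from main() for the demo. */
static unsigned long fake_counts[2];

/* Stub: pretend the caller always runs on node 0. */
static int numa_node_id(void) { return 0; }

/* Stub: return the fake counter instead of the real per-node state. */
static unsigned long node_page_state(int node, enum node_stat stat)
{
	(void)node;
	return fake_counts[stat];
}

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

unsigned long max_sane_readahead(unsigned long nr)
{
	unsigned long local_free_page = node_page_state(numa_node_id(), NR_INACTIVE_FILE)
				      + node_page_state(numa_node_id(), NR_FREE_PAGES);
	unsigned long sane_nr = min_ul(nr, MAX_REMOTE_READAHEAD);

	/* Local pages available: cap at half of them, as the current code
	 * does. Empty node: fall back to the fixed remote cap. */
	return local_free_page ? min_ul(nr, local_free_page / 2) : sane_nr;
}

int main(void)
{
	/* Populated local node: 60000 pages -> capped at 30000 pages. */
	fake_counts[NR_INACTIVE_FILE] = 10000;
	fake_counts[NR_FREE_PAGES]    = 50000;
	printf("populated node: %lu pages\n", max_sane_readahead(~0UL));

	/* Empty node: falls back to 4096 pages (16MB with 4k pages). */
	fake_counts[NR_INACTIVE_FILE] = 0;
	fake_counts[NR_FREE_PAGES]    = 0;
	printf("empty node:     %lu pages\n", max_sane_readahead(~0UL));
	return 0;
}

Note that, as in the email's proposal, an "unlimited" request (~0UL, the
natural argument for the readahead() system call per the quoted text) is
only hard-capped on the empty-node path; on a populated node the bound
still scales with local memory.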