lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <20130718130611.GB14274@gmail.com> Date: Thu, 18 Jul 2013 21:06:11 +0800 From: Zheng Liu <gnehzuil.liu@...il.com> To: Theodore Ts'o <tytso@....edu> Cc: Ext4 Developers List <linux-ext4@...r.kernel.org> Subject: Re: [PATCH 5/5] ext4: add new ioctl EXT4_IOC_PRECACHE_EXTENTS Hi Ted, Thanks for your explanation. I can always learn something from your reply. :-) On Wed, Jul 17, 2013 at 10:50:25PM -0400, Theodore Ts'o wrote: > On Thu, Jul 18, 2013 at 09:19:41AM +0800, Zheng Liu wrote: > > > > If I understand correctly, we don't want to reclaim from an inode with > > EXT4_STATE_EXT_PRECACHED flag when __ratelimit() returns 0, right? > > No, the intent of the code was to make sure we don't trigger the > warning too often, in case the system is under massive memory > pressure. In the original implementation of this ioctl which we used > at Google (with an extent cache that was much less functional than the > extent status tree we now have upstream), the extents were pinned in > memory permanently, until the inode is evicted from memory. > > I thought about doing this, since normally the cached extents will > take less memory than the extent tree in the buffer cache (especially > in any sane setup where the large tablespace, etc., files are are > fallocated in advance and are largely contiguous). But for upstream, > I was concerned that someone might deliberately create lots of > fragmented files, and then call the precache ioctl on all of them. Yes, at least for a internet company we can control everything, but for upstream the kernel might run under some weird environments. The lesson from this is that I need to think deeply for non-internet applications, and make a better design. Now I fully agree with you about this implementation. Meanwhile the patch looks good to me. Reviewed-by: Zheng Liu <wenqing.lz@...bao.com> - Zheng > > So what I did was to change the sort function such that the shrinker > would put those files at the end of the list. And although it's not > in the patch that I've sent out, I've since changed it so that if the > head of the list is an precached inode, and it's been more than 5 > seconds, we force a resort of the list. > > That way if we are under heavy memory pressure, we will eventually get > rid of the precached extents --- but under normal circumstnaces, we > try very hard not to, at least via the es_shrinker. (If the inode > gets closed, and then eventually the inode gets evicted, then of > course we'll drop all of the precached extents.) > > So the ratelimited warning is so we can know if this has happened, > since it's probably a sign that something bad has happened. Either a > process ran wild trying to precache too many extents, or the system > was under far more memory pressure, which is probably something that > needs to be fixed by changing some configuration parameter or by > tweaking the load balancer. > > - Ted > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@...r.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists