Date:	Tue, 11 May 2010 10:47:26 +0200
From:	"Alexander Stohr" <Alexander.Stohr@....de>
To:	linux-kernel@...r.kernel.org
Cc:	akpm@...l.org, trond.myklebust@....uio.no, riel@...hat.com,
	major@...x.net
Subject: Re: [BUG?] vfs_cache_pressure=0 does not free inode caches

Andrew Morton wrote:
> "Alexander Stohr" <Alexander.Stohr@...xxx> wrote:

> > I'm running an embedded system with NFS as my working area.
> > The system has only a little RAM left over; every MiB counts.
> >
> > My current best guess at resolving low-memory situations
> > is a manual one (no, I could not see any smart kernel reaction
> > with that relatively old but patched 2.6.18 kernel):
> >
> > echo 100000 >/proc/sys/vm/vfs_cache_pressure
> > sync
> > echo 1 >/proc/sys/vm/drop_caches
> > echo 2 >/proc/sys/vm/drop_caches

> I'm not sure what to say, really.

Thanks for your honest and helpful reply.

> If you tell the kernel not to reclaim inode/dentry caches then it will
> do what you asked. It _sounds_ like you're looking for more aggressive
> reclaim of the VFS caches when the system is getting low on memory.
> Perhaps this can be done by _increasing_ vfs_cache_pressure.

Yes, that is the method I already use. It probably does not have much impact: the caches still grow steadily as inodes are read, and raising that value does not stop the growth at the sizes I see in my setup (<20 MB). Obviously there is no timer for auto-dropping (I think I confused it with the timed auto-flushing of dirty disk write data).

A test drive with top, slabtop and a small C malloc helper program that touches every byte of a heap-allocated memory region made the picture clearer. (As this is intended to be an embedded system, there is no swap.)
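In essence the helper is something like the following minimal sketch (not the exact program; the size in MB comes from the command line, and it waits on stdin so the memory stays committed while I watch top):

/* Allocate the requested number of MB and touch every byte so each
 * page is really committed and overcommit cannot hide the allocation. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    size_t mb = (argc > 1) ? strtoul(argv[1], NULL, 10) : 10;
    size_t len = mb * 1024 * 1024;
    size_t i;
    char *buf = malloc(len);

    if (buf == NULL) {
        fprintf(stderr, "malloc of %lu MB failed\n", (unsigned long)mb);
        return 1;
    }
    for (i = 0; i < len; i++)           /* touch every single byte */
        buf[i] = (char)i;
    printf("%lu MB allocated and touched; press enter to release\n",
           (unsigned long)mb);
    getchar();                          /* hold the memory while watching top */
    free(buf);
    return 0;
}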

Step 1: fill the caches by running "ls -lR /"; abort it once the cache counter in top reaches some 5 MB.
Step 2: run the malloc helper with increasing size arguments (10 MB to 21 MB) until the OOM killer hits it for the first time; the cache counter drops down to 1.8 MB.
Step 3: write "3" to drop_caches (both without and with a prior sync); the cache counter drops further, down to 1.2 MB, while the dentry/inode/nfs_inode/shm_inode cache values still stay at a total of some 500 kB - that equals a bit more than 100 4k pages, or some 1000 512-byte disk sectors. (A C sketch of this step follows below.)

(Step 4: having manually dropped the caches did _not_ allow the application to allocate any more memory - I am puzzled.)
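For reference, the manual drop in step 3 boils down to this sketch (needs root; sync first so more of the pagecache is clean and therefore droppable):

/* Write "3" to /proc/sys/vm/drop_caches after a sync:
 * 1 = pagecache, 2 = dentry/inode slabs, 3 = both. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    FILE *f;

    sync();                             /* flush dirty data first */
    f = fopen("/proc/sys/vm/drop_caches", "w");
    if (f == NULL) {
        perror("/proc/sys/vm/drop_caches");
        return 1;
    }
    fputs("3\n", f);
    fclose(f);
    return 0;
}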

> But the
> kernel should wring the last drop out of the VFS caches before
> declaring OOM anyway - if it isn't doing that, we should fix it.

It is not that nothing is dropped from the slab-based areas - those areas are just the kernel's internal heap system, but some of that memory represents caches, and for those the slab footprint definitely shrinks. It is just that not everything in these caches gets dropped (as far as that can be diagnosed with the few diagnostic applications still alive). When the memory pressure comes from an application, a noticeable amount still stays in; when triggered manually, another chunk gets dropped, but in the end some memory remains dedicated to caches. Not that I worry too much about that now.

> Perhaps you could tell us exactly what behaviour you're observing, and
> how it differs from what you'd like to see.

Partly done above. I would expect the kernel's memory allocator under pressure to drain the caches down to the same level that can be reached manually (and without any urgent system need) via the drop_caches interface in /proc.

> > http://rackerhacker.com/2008/12/03/reducing-inode-and-dentry-caches-to-keep-oom-killer-at-bay/
> drop_caches only drops stuff which has been written back.

Thanks for commenting on that.
In contrast to the opinion on that web page, I assumed this to be a non-critical operation; otherwise machines used for benchmarking, for example, would have a risky and short lifetime.

So what is left over? The cache size reported by top does not drop below 1.2 MB, and the rough sum of cache-related data reported by slabtop is some 500 kB that looked quite persistent in the test. The reclaim the kernel invokes automatically when peak memory amounts are requested drops less cache than an explicit drop_caches request does. And dropping more cache memory does not let the application allocate any more - that is probably the least expected result I got.
