linux-kernel - Re: [PATCH 0/5] VFS: Directory level cache cleaning

Message-ID: <20131217220503.GA20579@dastard>
Date:	Wed, 18 Dec 2013 09:05:03 +1100
From:	Dave Chinner <david@...morbit.com>
To:	Li Wang <liwang@...ntukylin.com>
Cc:	Alexander Viro <viro@...iv.linux.org.uk>,
	Sage Weil <sage@...tank.com>, linux-fsdevel@...r.kernel.org,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	Yunchuan Wen <yunchuanwen@...ntukylin.com>
Subject: Re: [PATCH 0/5] VFS: Directory level cache cleaning

On Mon, Dec 16, 2013 at 07:00:04AM -0800, Li Wang wrote:
> Currently, Linux only supports file-system-wide VFS
> cache (dentry cache and page cache) cleaning through
> '/proc/sys/vm/drop_caches'. Sometimes this is not flexible
> enough. Applications may know exactly whether
> the metadata and data will be referenced in the future, so
> a desirable mechanism is one that lets applications
> reclaim the memory of unused cache entries at a finer
> granularity: the directory level. This lets applications
> keep hot metadata and data (to be referenced in the
> future) in the cache, and kick unused entries out to avoid
> cache thrashing. Another advantage is that it is more flexible
> for debugging.
>
> This patch extends the 'drop_caches' interface to
> support directory level cache cleaning and is fully
> backward compatible. '{1,2,3}' keeps the same semantics
> as before. In addition, "{1,2,3}:DIRECTORY_PATH_NAME" is allowed
> to recursively clean the caches under DIRECTORY_PATH_NAME.
> For example, 'echo 1:/home/foo/jpg > /proc/sys/vm/drop_caches'
> will clean the page caches of the files inside '/home/foo/jpg'.
> 
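The proposed syntax can be illustrated with a small parser. This is only a sketch of how the string might be split, not the code in the patch: `DropRequest` and `parse_drop_caches` are hypothetical names, and requiring an absolute path is an assumption, since the patch description does not say how relative paths are treated. The mode meanings (1 = page cache, 2 = dentries and inodes, 3 = both) are those of the existing drop_caches interface.

```python
from typing import NamedTuple, Optional


class DropRequest(NamedTuple):
    mode: int             # 1 = page cache, 2 = dentries and inodes, 3 = both
    path: Optional[str]   # None means no directory was given (existing behaviour)


def parse_drop_caches(value: str) -> DropRequest:
    """Parse '{1,2,3}' or '{1,2,3}:DIRECTORY_PATH_NAME' (the proposed syntax)."""
    mode_text, sep, path = value.strip().partition(":")
    if mode_text not in ("1", "2", "3"):
        raise ValueError(f"mode must be 1, 2 or 3, got {mode_text!r}")
    # Assumption for this sketch: a directory, if present, must be absolute.
    if sep and not path.startswith("/"):
        raise ValueError("directory must be an absolute path")
    return DropRequest(int(mode_text), path if sep else None)
```

With this sketch, `parse_drop_caches("1:/home/foo/jpg")` yields mode 1 with path '/home/foo/jpg', while a bare `"3"` yields mode 3 with no path, which matches the backward compatibility described above.
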
> It is easy to demonstrate the advantage of directory level
> cache cleaning. We use a virtual machine configured with
> an Intel(R) Xeon(R) 8-core CPU E5506 @ 2.13GHz, and with 1GB
> memory.  Three directories named '1', '2' and '3' are created,
> with each containing 180000 – 280000 files. The test program
> opens all files in a directory and then tries the next directory.
> The order for accessing the directories is '1', '2', '3',
> '1'.
> 
> The time to access '1' the second time is measured
> with and without cache cleaning, for different file counts.
> With cache cleaning, we clean all cache entries of the files
> in '2' before accessing the files in '3'. The results
> are as follows (in seconds):

This sounds like a highly contrived test case. There is no reason
why dentry cache access time would change going from 180k to 280k
files in 3 directories unless you're right at the memory pressure
balance point in terms of cache sizing.

> (Note: by default, VFS moves unreferenced inodes
> onto a global LRU list rather than freeing them. For this
> experiment, we modified iput() to force the inode to be freed
> as well. This behavior and the related code are left for further
> discussion and are thus not reflected in this patch.)
> 
> Number of files:   180000 200000 220000 240000 260000
> Without cleaning:  2.165  6.977  10.032 11.571 13.443
> With cleaning:     1.949  1.906  2.336  2.918  3.651
>
> When the number of files is 180000 in each directory,
> the metadata cache is large enough to hold all entries
> of the three directories, so re-accessing '1' will hit in
> the cache, regardless of whether '2' is cleaned up or not.
> As the number of files increases, the cache can only
> hold slightly more than two directories. Accessing '3' will cause
> some entries of '1' to be evicted (due to LRU). When re-accessing '1',
> some entries need to be reloaded from disk, which is time-consuming.
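
The eviction pattern described in the quoted paragraph can be reproduced with a toy LRU model. The sketch below is an illustration under stated assumptions, not the kernel's reclaim logic: the cache is a single LRU list counted in entries rather than bytes, the capacity and directory sizes are scaled-down hypothetical numbers, and `misses_on_reaccess` is a name made up for this example.

```python
from collections import OrderedDict


def misses_on_reaccess(files_per_dir: int, capacity: int, clean_dir2: bool) -> int:
    """Scan directories 1, 2, 3, 1 and count the misses in the final scan of 1."""
    cache = OrderedDict()  # insertion order doubles as LRU order (oldest first)

    def touch(key) -> bool:
        """Access one entry; return True on a hit, False on a miss."""
        if key in cache:
            cache.move_to_end(key)      # mark as most recently used
            return True
        cache[key] = None
        if len(cache) > capacity:
            cache.popitem(last=False)   # evict the least recently used entry
        return False

    def scan(directory: int) -> int:
        """Open every file in one directory; return the number of misses."""
        return sum(not touch((directory, i)) for i in range(files_per_dir))

    scan(1)
    scan(2)
    if clean_dir2:
        # Stand-in for the proposed directory-level cleaning of '2'.
        for i in range(files_per_dir):
            cache.pop((2, i), None)
    scan(3)
    return scan(1)
```

With a hypothetical capacity of 600 entries, `misses_on_reaccess(180, 600, False)` returns 0 because all three directories fit. At 260 files per directory, the same call without cleaning returns 260: once the working set exceeds the cache, a sequential rescan under strict LRU evicts each entry just before it is needed. With `clean_dir2=True` it returns 0 again. The model is deliberately simpler than the measured results, which still grow slowly with cleaning enabled.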

Ok, so exactly as I thought - your example working set is slightly
larger than what the cache holds. Hence what you are describing is
a cache reclaim threshold effect: something you can avoid with
/proc/sys/vm/vfs_cache_pressure.
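
For reference, the tunable can be inspected and changed from a script. This is a minimal sketch: `read_pressure` and `write_pressure` are hypothetical helper names, writing to the real file requires root, and the value 50 used below is purely illustrative rather than a recommendation. The kernel default is 100, and lower values make the kernel prefer to retain dentry and inode caches.

```python
from pathlib import Path

VFS_CACHE_PRESSURE = Path("/proc/sys/vm/vfs_cache_pressure")


def read_pressure(path: Path = VFS_CACHE_PRESSURE) -> int:
    """Return the current vfs_cache_pressure setting as an integer."""
    return int(path.read_text().strip())


def write_pressure(value: int, path: Path = VFS_CACHE_PRESSURE) -> None:
    """Set vfs_cache_pressure; needs root when targeting the real /proc file."""
    if value < 0:
        raise ValueError("vfs_cache_pressure must be non-negative")
    path.write_text(f"{value}\n")
```

For example, `write_pressure(50)` followed by `read_pressure()` should report 50 on a system where the caller has permission to change the setting. The `path` parameter exists only so the helpers can be exercised against an ordinary file.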

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com