lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 08 Jan 2014 10:06:31 +0800
From:	Li Wang <liwang@...ntukylin.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
CC:	Alexander Viro <viro@...iv.linux.org.uk>,
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, Cong Wang <xiyou.wangcong@...il.com>,
	Zefan Li <lizefan@...wei.com>, Matthew Wilcox <matthew@....cx>,
	Yunchuan Wen <yunchuanwen@...ntukylin.com>,
	Dave Chinner <david@...morbit.com>
Subject: Re: [PATCH 2/3] Add shrink_pagecache_parent

Hi,

On 01/03/2014 07:55 AM, Andrew Morton wrote:
> On Mon, 30 Dec 2013 21:45:17 +0800 Li Wang <liwang@...ntukylin.com> wrote:
>
>> Analogous to shrink_dcache_parent except that it collects inodes.
>> It is not very appropriate to be put in dcache.c, but d_walk can only
>> be invoked from here.
>
> Please cc Dave Chinner on future revisions.  He be da man.
>
> The overall intent of the patchset seems reasonable and I agree that it
> can't be efficiently done from userspace with the current kernel API.
> We *could* do it from userspace by providing facilities for userspace to
> query the VFS caches: "is this pathname in the dentry cache" and "is
> this inode in the inode cache".
>
Even we have these available, i am afraid it will still introduce
non-negligible overhead due to frequent system calls for a directory
  walking operation, especially under massive small file situations.

>> --- a/fs/dcache.c
>> +++ b/fs/dcache.c
>> @@ -1318,6 +1318,42 @@ void shrink_dcache_parent(struct dentry *parent)
>>   }
>>   EXPORT_SYMBOL(shrink_dcache_parent);
>>
>> +static enum d_walk_ret gather_inode(void *data, struct dentry *dentry)
>> +{
>> +	struct list_head *list = data;
>> +	struct inode *inode = dentry->d_inode;
>> +
>> +	if ((inode == NULL) || ((!inode_owner_or_capable(inode)) &&
>> +				(!capable(CAP_SYS_ADMIN))))
>> +		goto out;
>> +	spin_lock(&inode->i_lock);
>> +	if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
>
> It's unclear what rationale lies behind this particular group of tests.
>
>> +		(inode->i_mapping->nrpages == 0) ||
>> +		(!list_empty(&inode->i_lru))) {
>
> arg, the "Inode locking rules" at the top of fs/inode.c needs a
> refresh, I suspect.  It is too vague.
>
> Formally, inode->i_lru is protected by
> i_sb->s_inode_lru->node[nid].lock, not by ->i_lock.  I guess you can
> just do a list_lru_add() and that will atomically add the inode to your
> local list_lru if ->i_lru wasn't being used for anything else.
>
> I *think* that your use of i_lock works OK, because code which fiddles
> with i_lru and s_inode_lru also takes i_lock.  However we need to
> decide which is the preferred and official lock.  ie: what is the
> design here??
>
> However...  most inodes will be on an LRU list, won't they?  Doesn't
> this reuse of i_lru mean that many inodes will fail to be processed?
> If so, we might need to add a new list_head to the inode, which will be
> problematic.
>
As far as I know, fix me if i am wrong, only when inode has zero
reference count, it will be put into superblock lru list. For most
situations, there is at least a dentry refers to it, so it will not
be on any lru list.

>
> Aside: inode_lru_isolate() fiddles directly with inode->i_lru without
> taking i_sb->s_inode_lru->node[nid].lock.  Why doesn't this make a
> concurrent s_inode_lru walker go oops??  Should we be using
> list_lru_del() in there?  (which should have been called
> list_lru_del_init(), sigh).
>
It seems inode_lru_isolate() only called by prune_icache_sb() as
a callback function. Before calling it, the caller has hold
the lock.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ