Message-ID: <20170209084016.GL13195@ZenIV.linux.org.uk>
Date: Thu, 9 Feb 2017 08:40:16 +0000
From: Al Viro <viro@...IV.linux.org.uk>
To: Konstantin Khlebnikov <koct9i@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] proc/sysctl: drop unregistered stale dentries as soon as
possible
On Thu, Feb 09, 2017 at 10:36:15AM +0300, Konstantin Khlebnikov wrote:
> Ok, Thank you. I've expected that this fix isn't sane,
>
> Maybe we could minimize changes for now. For example: keep these
> stale dentries in memory but silently unhash them in ->d_compare().
> Memory pressure and the reclaimer will kill them later.
->d_compare() is called by the code walking the hash chains. What's worse,
in the most common case all we have is rcu_read_lock(). Modifying the chain
in an RCU reader is a no-go. Turning __d_lookup_rcu() into a writer on the
off-chance that we'll walk onto a visibly stale sysctl dentry - even more so.
If you want to deal with that, do it right, please. Have sysctl inodes
on a list of some kind anchored in struct ctl_table_header; insert them
there in proc_sys_make_inode(), remove - in proc_evict_inode() (or
have it pass the inode to sysctl_head_put() and do the removal there).
Use sysctl_lock for serialization.
In start_unregistering(), just before the erase_header() call, check
if the list is non-empty and if it is -
	grab sysctl_lock
	last = NULL
	walk the list
		igrab(inode we are looking at)
		if succeeded
			drop sysctl_lock
			iput(last)
			last = that inode
			d_prune_aliases(last)
			retake sysctl_lock
			// inode is still not evicted, so it's still on the list
	drop sysctl_lock
	iput(last)
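
The walk above can be modelled in plain userspace C to make the locking order explicit. This is only a sketch, not kernel code: struct fake_inode, fake_igrab(), fake_iput(), fake_prune_aliases() and prune_list() are hypothetical stand-ins for struct inode, igrab(), iput(), d_prune_aliases() and the new code in start_unregistering(), and sysctl_lock is reduced to a flag with assertions (single-threaded model). It assumes, as the pseudocode does, that a pinned inode stays on the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a sysctl inode on the per-header list. */
struct fake_inode {
	struct fake_inode *next;	/* list linkage, walked under the lock */
	int refs;			/* reference count (single-threaded model) */
	int freeing;			/* models an inode already being evicted */
	int pruned;			/* set by the d_prune_aliases() stand-in */
};

/* Model of sysctl_lock: a flag whose assertions catch misuse. */
static int lock_held;
static void fake_lock(void)   { assert(!lock_held); lock_held = 1; }
static void fake_unlock(void) { assert(lock_held);  lock_held = 0; }

/* Models igrab(): fails for an inode that is being evicted. */
static struct fake_inode *fake_igrab(struct fake_inode *inode)
{
	assert(lock_held);
	if (inode->freeing)
		return NULL;
	inode->refs++;
	return inode;
}

/* Models iput(): must run unlocked; NULL is accepted, as with iput(). */
static void fake_iput(struct fake_inode *inode)
{
	assert(!lock_held);
	if (inode)
		inode->refs--;
}

/* Models d_prune_aliases(): must run unlocked. */
static void fake_prune_aliases(struct fake_inode *inode)
{
	assert(!lock_held);
	inode->pruned = 1;
}

/* The walk itself: pin, unlock, release previous pin, prune, relock. */
static void prune_list(struct fake_inode *head)
{
	struct fake_inode *last = NULL;
	struct fake_inode *pos;

	fake_lock();
	for (pos = head; pos; pos = pos->next) {
		if (!fake_igrab(pos))
			continue;	/* already going away, skip it */
		fake_unlock();
		fake_iput(last);	/* drop the pin on the previous inode */
		last = pos;
		fake_prune_aliases(last);
		fake_lock();
		/* 'last' is still pinned, so pos->next is safe to follow */
	}
	fake_unlock();
	fake_iput(last);
}
```

The reference on the previous inode is dropped only after the next one has been pinned and the lock released, so the release never runs under the lock and the walker always holds a pin on its current position in the list.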
The list would pass through struct proc_inode, and I would probably use
hlist rather than the normal list_head; might be more convenient to initialize
that way. Getting from containing struct proc_inode to inode - &ei->vfs_inode.
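
A rough userspace illustration of that layout, under stated assumptions: the model_* types below imitate the kernel's hlist_head/hlist_node and container_of() but are re-implemented here, and the field name sysctl_inodes is made up for the example. An empty hlist head is just a NULL pointer, so a zeroed header needs no further setup.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace re-implementation of the container_of idiom. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct model_hlist_node { struct model_hlist_node *next, **pprev; };
struct model_hlist_head { struct model_hlist_node *first; };

struct model_inode { unsigned long i_ino; };

/* Stand-in for struct proc_inode: list linkage plus the embedded inode. */
struct model_proc_inode {
	struct model_hlist_node sysctl_inodes;	/* hypothetical field name */
	struct model_inode vfs_inode;
};

/* Insert at the head of the list (what make-inode time would do). */
static void model_hlist_add_head(struct model_hlist_node *n,
				 struct model_hlist_head *h)
{
	struct model_hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink a node (what eviction time would do). */
static void model_hlist_del(struct model_hlist_node *n)
{
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
}

/* From list node to containing proc_inode, then to the inode itself. */
static struct model_inode *model_node_to_inode(struct model_hlist_node *n)
{
	struct model_proc_inode *ei =
		model_container_of(n, struct model_proc_inode, sysctl_inodes);

	return &ei->vfs_inode;
}
```

In the real code both the insertion and the removal would be done under sysctl_lock, as described earlier in this message.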
It's not that much work; if you have time - go for it, or remind me after
-rc1...