[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <999b101e-329a-d8d3-0dec-3714aac828fe@huawei.com>
Date: Wed, 1 Jun 2022 09:27:09 +0800
From: Zhihao Cheng <chengzhihao1@...wei.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
CC: <linux-kernel@...r.kernel.org>, <linux-fsdevel@...r.kernel.org>,
<yukuai3@...wei.com>, <yi.zhang@...wei.com>,
Alexey Gladkov <legion@...nel.org>
Subject: Re: [PATCH RFC] proc: Fix a dentry lock race between release_task and
lookup
Hi,
> If I understand correctly you are saying that under some circumstances
> this code runs slow, and you are proposing an optimization.
>
> That optimization is to change the content of the pid->inodes list
> from all directories under that pid, to just the /proc/<tgid> and
> /proc/<tgid>/task/<pid>.
>
Yes and yes.
> The justification being that d_invalidate on the parent directory will
> invalidate all children. So only those two directories are interesting
> from a d_invalidate point of view.
>
Absolutely right.
> That seems like a valid optimization.
>
> This could also count as a regression fix if you can show how the
> performance changed poorly when the pid->inodes change was introduced
> and how the performance improves with your change. I currently only
> see that you hit a pathological case and you are correcting it.
There is a reproducer attached in commit msg:
https://bugzilla.kernel.org/show_bug.cgi?id=216054
Create 200 tasks, each task open one file for 50,000 times. Kill all
tasks when opened files exceed 10,000,000 (cat /proc/sys/fs/file-nr).
Before fix:
$ time killall -wq aa
real 4m40.946s # During this period, we can see 'ps' and 'systemd'
taking high cpu usage.
After fix:
$ time killall -wq aa
real 1min~ # During this period, we can see 'systemd' taking high cpu
usage.
>
> As for the actual code change I think it would be better to
> remove the code from proc_pid_make_inode and make a helper
> proc_pid_make_base_inode that performs the extra work of
> adding to the pid->list. Not adding a flag makes the code
> easier to follow.
>
Agree, will send a v2.
> Something like the code below.
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index d654ce7150fd..9d025e70ddc3 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1915,11 +1915,6 @@ struct inode *proc_pid_make_inode(struct super_block * sb,
>
> /* Let the pid remember us for quick removal */
> ei->pid = pid;
> - if (S_ISDIR(mode)) {
> - spin_lock(&pid->lock);
> - hlist_add_head_rcu(&ei->sibling_inodes, &pid->inodes);
> - spin_unlock(&pid->lock);
> - }
>
> task_dump_owner(task, 0, &inode->i_uid, &inode->i_gid);
> security_task_to_inode(task, inode);
> @@ -1932,6 +1927,27 @@ struct inode *proc_pid_make_inode(struct super_block * sb,
> return NULL;
> }
>
> +struct inode *proc_pid_make_base_inode(struct super_block * sb,
> + struct task_struct *task, umode_t mode)
> +{
> + struct inode * inode;
> + struct proc_inode *ei;
> + struct pid *pid;
> +
> + inode = proc_pid_make_inode(sb, task, mode);
> + if (!inode)
> + return NULL;
> +
> + /* Let proc_flush_pid find this directory inode */
> + ei = PROC_I(inode);
> + pid = ei->pid;
> + spin_lock(&pid->lock);
> + hlist_add_head_rcu(&ei->sibling_inodes, &pid->inodes);
> + spin_unlock(&pid->lock);
> +
> + return inode;
> +}
> +
> int pid_getattr(struct user_namespace *mnt_userns, const struct path *path,
> struct kstat *stat, u32 request_mask, unsigned int query_flags)
> {
> @@ -3351,7 +3367,7 @@ static struct dentry *proc_pid_instantiate(struct dentry * dentry,
> {
> struct inode *inode;
>
> - inode = proc_pid_make_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
> + inode = proc_pid_make_base_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
> if (!inode)
> return ERR_PTR(-ENOENT);
>
> @@ -3650,7 +3666,7 @@ static struct dentry *proc_task_instantiate(struct dentry *dentry,
> struct task_struct *task, const void *ptr)
> {
> struct inode *inode;
> - inode = proc_pid_make_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
> + inode = proc_pid_make_base_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
> if (!inode)
> return ERR_PTR(-ENOENT);
>
>
> Eric
> .
>
Powered by blists - more mailing lists