lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87pn3ag2ut.fsf@x220.int.ebiederm.org>
Date:   Tue, 15 Dec 2020 15:45:46 -0600
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Stephen Brennan <stephen.s.brennan@...cle.com>
Cc:     Alexey Dobriyan <adobriyan@...il.com>,
        James Morris <jmorris@...ei.org>,
        "Serge E. Hallyn" <serge@...lyn.com>,
        linux-security-module@...r.kernel.org,
        Paul Moore <paul@...l-moore.com>,
        Stephen Smalley <stephen.smalley.work@...il.com>,
        Eric Paris <eparis@...isplace.org>, selinux@...r.kernel.org,
        Casey Schaufler <casey@...aufler-ca.com>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        Matthew Wilcox <willy@...radead.org>
Subject: Re: [PATCH v2] proc: Allow pid_revalidate() during LOOKUP_RCU

Stephen Brennan <stephen.s.brennan@...cle.com> writes:

> ebiederm@...ssion.com (Eric W. Biederman) writes:
>
>> Stephen Brennan <stephen.s.brennan@...cle.com> writes:
>>
>>> The pid_revalidate() function requires dropping from RCU into REF lookup
>>> mode. When many threads are resolving paths within /proc in parallel,
>>> this can result in heavy spinlock contention as each thread tries to
>>> grab a reference to the /proc dentry lock (and drop it shortly
>>> thereafter).
>>
>> I am feeling dense at the moment.  Which lock specifically are you
>> referring to?  The only locks I can thinking of are sleeping locks,
>> not spinlocks.
>
> The lock in question is the d_lockref field (aliased as d_lock) of
> struct dentry. It is contended in this code path while processing the
> "/proc" dentry, switching from RCU to REF mode.
>
>     walk_component()
>       lookup_fast()
>         d_revalidate()
>           pid_revalidate() // returns -ECHILD
>         unlazy_child()
>           lockref_get_not_dead(&nd->path.dentry->d_lockref)
>
>>
>>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>>> index ebea9501afb8..833d55a59e20 100644
>>> --- a/fs/proc/base.c
>>> +++ b/fs/proc/base.c
>>> @@ -1830,19 +1846,22 @@ static int pid_revalidate(struct dentry *dentry, unsigned int flags)
>>>  {
>>>  	struct inode *inode;
>>>  	struct task_struct *task;
>>> +	int rv = 0;
>>>  
>>> -	if (flags & LOOKUP_RCU)
>>> -		return -ECHILD;
>>> -
>>> -	inode = d_inode(dentry);
>>> -	task = get_proc_task(inode);
>>> -
>>> -	if (task) {
>>> -		pid_update_inode(task, inode);
>>> -		put_task_struct(task);
>>> -		return 1;
>>> +	if (flags & LOOKUP_RCU) {
>>
>> Why do we need to test flags here at all?
>> Why can't the code simply take an rcu_read_lock unconditionally and just
>> pass flags into do_pid_update_inode?
>>
>
> I don't have any good reason. If it is safe to update the inode without
> holding a reference to the task struct (or holding any other lock) then
> I can consolidate the whole conditional.


I am not certain if there is anything that makes it safe to change uid
and gid on the inode during lookup.  The current code might be buggy in
that respect.

However it is absoltely safe to read from the task_struct with just the
rcu_read_lock.  Which means it is only do_pid_update_inode that needs
locking to update the inode.

>>> +		inode = d_inode_rcu(dentry);
>>> +		task = pid_task(proc_pid(inode), PIDTYPE_PID);
>>> +		if (task)
>>> +			rv = do_pid_update_inode(task, inode, flags);
>>> +	} else {
>>> +		inode = d_inode(dentry);
>>> +		task = get_proc_task(inode);
>>> +		if (task) {
>>> +			rv = do_pid_update_inode(task, inode, flags);
>>> +			put_task_struct(task);
>>> +		}
>>
>>>  	}
>>> -	return 0;
>>> +	return rv;
>>>  }
>>>  
>>>  static inline bool proc_inode_is_dead(struct inode *inode)
>>
>> Eric

Eric

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ