Message-ID: <57A1FCE5.3040206@kyup.com>
Date: Wed, 3 Aug 2016 17:17:09 +0300
From: Nikolay Borisov <kernel@...p.com>
To: Jeff Layton <jlayton@...chiereds.net>, bfields@...ldses.org
Cc: viro@...iv.linux.org.uk, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, ebiederm@...ssion.com,
containers@...ts.linux-foundation.org,
Andrey Vagin <avagin@...nvz.org>, xemul@...tuozzo.com
Subject: Re: [PATCH v2] locks: Filter /proc/locks output on proc pid ns
On 08/03/2016 04:46 PM, Jeff Layton wrote:
> On Wed, 2016-08-03 at 10:35 +0300, Nikolay Borisov wrote:
>> On busy container servers reading /proc/locks shows all the locks
>> created by all clients. This can cause large latency spikes. In my
>> case I observed lsof taking up to 5-10 seconds while processing around
>> 50k locks. Fix this by limiting the locks shown only to those created
>> in the same pidns as the one the proc was mounted in. When reading
>> /proc/locks from the init_pid_ns show everything.
>>
>> Signed-off-by: Nikolay Borisov <kernel@...p.com>
>> ---
>> fs/locks.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/fs/locks.c b/fs/locks.c
>> index ee1b15f6fc13..751673d7f7fc 100644
>> --- a/fs/locks.c
>> +++ b/fs/locks.c
>> @@ -2648,9 +2648,15 @@ static int locks_show(struct seq_file *f, void *v)
>> {
>>  	struct locks_iterator *iter = f->private;
>>  	struct file_lock *fl, *bfl;
>> +	struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info;
>> +	struct pid_namespace *current_pidns = task_active_pid_ns(current);
>>  
>>  	fl = hlist_entry(v, struct file_lock, fl_link);
>>  
>> +	if ((current_pidns != &init_pid_ns) && fl->fl_nspid
>
> Ok, so when you read from a process that's in the init_pid_ns
> namespace, then you'll get the whole pile of locks, even when reading
> this from a filesystem that was mounted in a different pid_ns?
>
> That seems odd to me if so. Any reason not to just uniformly use the
> proc_pidns here?

[CCing some people from openvz/CRIU]

My train of thought was that there should be one vantage point that sees
the universal truth about everything, and that this would be a process in
the init_pid_ns. I don't have a strong preference, as long as I'm not
breaking userspace. As I said before, I think the CRIU guys might be
using that interface.
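To make the two options concrete, here is a minimal userspace sketch of
both filters. It is not kernel code: struct pid_ns, struct lock_rec,
show_v2() and show_proc_only() are hypothetical stand-ins. The second
function is only one possible reading of "uniformly use the proc_pidns",
namely replacing current_pidns with proc_pidns in the init_pid_ns check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pid_namespace; identity is all we need. */
struct pid_ns { int id; };

/* Hypothetical stand-in for a file_lock; owner_ns == NULL models a lock
 * whose fl_nspid is not set. */
struct lock_rec { const struct pid_ns *owner_ns; };

/* Filter as written in the v2 patch: a reader running in the init
 * namespace sees every lock, whichever proc mount it reads from. */
static bool show_v2(const struct pid_ns *init_ns,
		    const struct pid_ns *reader_ns,
		    const struct pid_ns *proc_ns,
		    const struct lock_rec *l)
{
	if (reader_ns != init_ns && l->owner_ns && proc_ns != l->owner_ns)
		return false;
	return true;
}

/* Assumed alternative: only the namespace of the proc mount decides,
 * and the reader's own namespace is ignored. */
static bool show_proc_only(const struct pid_ns *init_ns,
			   const struct pid_ns *proc_ns,
			   const struct lock_rec *l)
{
	if (proc_ns != init_ns && l->owner_ns && proc_ns != l->owner_ns)
		return false;
	return true;
}
```

The two differ only when a task in the init_pid_ns reads a proc instance
that was mounted in a container: show_v2() still lists locks from other
containers, while show_proc_only() hides them.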
>
>> +	    && (proc_pidns != ns_of_pid(fl->fl_nspid)))
>> +		return 0;
>> +
>>  	lock_get_status(f, fl, iter->li_pos, "");
>>  
>>  	list_for_each_entry(bfl, &fl->fl_block, fl_block)
>