Message-ID: <87sgjdde0v.fsf@x220.int.ebiederm.org>
Date: Thu, 13 Feb 2020 21:49:20 -0600
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
Kernel Hardening <kernel-hardening@...ts.openwall.com>,
Linux API <linux-api@...r.kernel.org>,
Linux FS Devel <linux-fsdevel@...r.kernel.org>,
Linux Security Module <linux-security-module@...r.kernel.org>,
Akinobu Mita <akinobu.mita@...il.com>,
Alexey Dobriyan <adobriyan@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>,
Daniel Micay <danielmicay@...il.com>,
Djalal Harouni <tixxdz@...il.com>,
"Dmitry V . Levin" <ldv@...linux.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Ingo Molnar <mingo@...nel.org>,
"J . Bruce Fields" <bfields@...ldses.org>,
Jeff Layton <jlayton@...chiereds.net>,
Jonathan Corbet <corbet@....net>,
Kees Cook <keescook@...omium.org>,
Oleg Nesterov <oleg@...hat.com>,
Solar Designer <solar@...nwall.com>
Subject: Re: [PATCH v8 07/11] proc: flush task dcache entries from all procfs instances
Al Viro <viro@...iv.linux.org.uk> writes:
> On Wed, Feb 12, 2020 at 12:35:04PM -0800, Linus Torvalds wrote:
>> On Wed, Feb 12, 2020 at 12:03 PM Al Viro <viro@...iv.linux.org.uk> wrote:
>> >
>> > What's to prevent racing with fs shutdown while you are doing the second part?
>>
>> I was thinking that only the proc_flush_task() code would do this.
>>
>> And that holds a ref to the vfsmount through upid->ns.
>>
>> So I wasn't suggesting doing this in general - just splitting up the
>> implementation of d_invalidate() so that proc_flush_task_mnt() could
>> delay the complex part to after having traversed the RCU-protected
>> list.
>>
>> But hey - I missed this part of the problem originally, so maybe I'm
>> just missing something else this time. Wouldn't be the first time.
>
> Wait, I thought the whole point of that had been to allow multiple
> procfs instances for the same userns? Confused...
Multiple procfs instances for the same pidns. Exactly.
Which would let people have their own set of procfs mount
options without having to worry about stomping on someone else.
The fundamental problem with multiple procfs instances per pidns
is that there isn't an obvious place to put a vfs mount.
...
Which means we need some way to keep the file system from going away
while anyone in the kernel is running proc_flush_task.
One way I can see to solve this, which would give us cheap readers, is to
have a percpu count of the number of processes in proc_flush_task.
That would work something like mnt_count.
Then forbid proc_kill_sb from removing any super block from the list
or otherwise making progress until the proc_flush_task_count goes
to zero.
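A rough userspace illustration of that counting scheme follows. It is
only a sketch, not kernel code: NR_SLOTS, flush_enter, flush_exit,
flush_count_sum and kill_sb_may_proceed are all hypothetical names, an
array of C11 atomics stands in for a real percpu counter, and it leaves
out how new entries would be held off while the writer waits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_SLOTS 4 /* hypothetical stand-in for the number of CPUs */

/* One counter per slot, imitating a percpu count. */
static atomic_long flush_count[NR_SLOTS];

/* Reader side: cheap, touches only the caller's own slot. */
static void flush_enter(int slot)
{
	assert(slot >= 0 && slot < NR_SLOTS);
	atomic_fetch_add(&flush_count[slot], 1);
}

/*
 * A reader may leave on a different slot than it entered on, so a
 * single slot can go negative; only the sum is meaningful.
 */
static void flush_exit(int slot)
{
	assert(slot >= 0 && slot < NR_SLOTS);
	atomic_fetch_sub(&flush_count[slot], 1);
}

/* Writer side: expensive, has to visit every slot. */
static long flush_count_sum(void)
{
	long sum = 0;

	for (int i = 0; i < NR_SLOTS; i++)
		sum += atomic_load(&flush_count[i]);
	return sum;
}

/* The kill_sb path would refuse to make progress until this is true. */
static bool kill_sb_may_proceed(void)
{
	return flush_count_sum() == 0;
}
```

The point of the split is that the common path pays for one local
increment and decrement, while the rare teardown path pays for the full
sum.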
If we wanted cheap readers and an expensive writer
kind of flag that proc_kill_sb can
Thinking out loud, perhaps we could add a list_head on task_struct
and a list_head in proc_inode. That would let us find the inodes,
and by extension the dentries we care about, quickly.
Then in evict_inode we could remove the proc_inode from the list.
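To make the shape of that concrete, here is a small userspace sketch,
again not kernel code. struct list_node imitates the kernel's list_head,
and struct task, struct pinode, pinode_attach, pinode_evict and
task_count_inodes are hypothetical stand-ins for task_struct, proc_inode
and the hooks involved. Locking is ignored entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, in the style of list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static void list_add(struct list_node *node, struct list_node *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

static void list_del(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = node;
}

/* Hypothetical stand-in for task_struct: heads the list of its inodes. */
struct task {
	struct list_node proc_inodes;
};

/* Hypothetical stand-in for proc_inode: one per procfs instance. */
struct pinode {
	struct list_node task_link;
	int instance;
};

/* Recover the containing pinode from its embedded list node. */
#define pinode_of(ptr) \
	((struct pinode *)((char *)(ptr) - offsetof(struct pinode, task_link)))

/* Called when an inode for this task is created in some instance. */
static void pinode_attach(struct pinode *pi, struct task *t)
{
	assert(pi && t);
	list_add(&pi->task_link, &t->proc_inodes);
}

/* What the evict_inode step would do: drop the inode from the list. */
static void pinode_evict(struct pinode *pi)
{
	list_del(&pi->task_link);
}

/* The flush path would walk this list instead of every super block. */
static int task_count_inodes(const struct task *t)
{
	int n = 0;

	for (const struct list_node *p = t->proc_inodes.next;
	     p != &t->proc_inodes; p = p->next)
		n++;
	return n;
}
```

With this layout the flush only visits inodes that actually exist for
the exiting task, whichever procfs instance they belong to.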
Eric