Message-ID: <87d0wetyh2.fsf@xmission.com>
Date: Mon, 25 Jun 2018 15:14:49 -0500
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Linux Containers <containers@...ts.linux-foundation.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
astrachan@...gle.com, Andrew Morton <akpm@...ux-foundation.org>,
Al Viro <viro@...iv.linux.org.uk>,
David Howells <dhowells@...hat.com>,
Oleg Nesterov <oleg@...hat.com>,
Alexey Dobriyan <adobriyan@...il.com>
Subject: Re: [GIT PULL] userns fixes for 4.17-rc2
Linus Torvalds <torvalds@...ux-foundation.org> writes:
> On Tue, Jun 19, 2018 at 8:24 PM Eric W. Biederman <ebiederm@...ssion.com> wrote:
>>
>> I stared at this code for quite a while and I finally concluded that the
>> best course forward is to simplify things and remove the internal kernel
>> mount of proc. The internal mount of proc is directly responsible for
>> this regression and it has been the source of pain over the years.
>
> This is not the kind of patch that I'm willing to take outside the
> merge window. This is *way* too subtle, and making sysctl do a
> kern_mount()/kern_umount() seems odd.
I understand the feedback about breaking up the patch and the concern
about the race with pid->count.
I don't understand the feedback about only accepting something like this
during the merge window. The entire point of my change was to remove
subtlety. The code was very straightforward to test.
This is a silent regression of a security feature, so it is possible some
people have upgraded their kernel without noticing the regression and are
affected by the information leak that not honoring hidepid introduces. That
seems to me to make it a candidate for stable and thus for an rc kernel.
Would you prefer, for now, a patch that does less towards fixing the root
cause and can be backported to stable?
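Because the regression is silent, one way to look for it from userspace is to read back the options the kernel reports for the proc mount. The following is a minimal sketch, not part of the patch under discussion. It assumes the kernel lists the effective value as a numeric hidepid=N entry in the options field of /proc/self/mounts, as kernels of this era do. The helper names hidepid_from_opts and proc_hidepid are hypothetical.

```c
#include <mntent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return N from a "hidepid=N" entry in a
 * comma-separated mount option string, or -1 if the option is
 * absent or not numeric.
 */
int hidepid_from_opts(const char *opts)
{
    const char *p = opts;

    while (p && *p) {
        if (strncmp(p, "hidepid=", 8) == 0) {
            char *end;
            long v = strtol(p + 8, &end, 10);

            /* Accept only a complete numeric value. */
            if (end != p + 8 && (*end == ',' || *end == '\0'))
                return (int)v;
            return -1;
        }
        /* Advance to the start of the next option, if any. */
        p = strchr(p, ',');
        if (p)
            p++;
    }
    return -1;
}

/*
 * Hypothetical helper: scan a mounts table (assumed to be in
 * /proc/self/mounts format) and return the hidepid value of the
 * first mount of type "proc", or -1 if none reports one.
 */
int proc_hidepid(const char *mounts_path)
{
    FILE *f = setmntent(mounts_path, "r");
    struct mntent *m;
    int result = -1;

    if (!f)
        return -1;
    while ((m = getmntent(f)) != NULL) {
        if (strcmp(m->mnt_type, "proc") == 0) {
            result = hidepid_from_opts(m->mnt_opts);
            break;
        }
    }
    endmntent(f);
    return result;
}
```

If proc was mounted with hidepid=2 but proc_hidepid("/proc/self/mounts") returns -1, the option was probably not applied. This check is only a hint, since it trusts what the kernel reports in the mount table.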
> The pid->count test also looks potentially racy to me.
The function proc_flush_task is already racy; it is just an optimization
that needs to work the vast majority of the time or we get lots of stale
useless cached dentries in proc. So I don't think a little race
between testing pid->count and someone accessing a proc inode matters.
They could always perform the access after proc_flush_task is done
and before unhash_process runs, and achieve the same effect.
Though in retrospect, my testing showed processes accessing proc self
from libc or something, so the pid->count optimization never really
hit. So it is probably better just to remove it.
The kern_mount/kern_umount are definitely odd and not my favorite. But
the code does work. It is my intention and hope that both the
uml and the sysctl(2) code can be removed. I need to double check,
but I don't think there are even any enterprise kernels that enable
sysctl(2) support in the kernel any more.
Eric