Date:   Mon, 10 Oct 2016 17:39:07 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Nikolay Borisov <kernel@...p.com>
Cc:     Jan Kara <jack@...e.cz>, John McCutchan <john@...nmccutchan.com>,
        Eric Paris <eparis@...isplace.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        "Serge E. Hallyn" <serge@...lyn.com>,
        Andrey Vagin <avagin@...nvz.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux Containers <containers@...ts.linux-foundation.org>
Subject: Re: [PATCH] inotify: Convert to using per-namespace limits

Nikolay Borisov <kernel@...p.com> writes:

> On Mon, Oct 10, 2016 at 11:49 PM, Eric W. Biederman
> <ebiederm@...ssion.com> wrote:
>> Jan Kara <jack@...e.cz> writes:
>>
>>> On Mon 10-10-16 09:44:19, Nikolay Borisov wrote:
>>>> On 10/07/2016 09:14 PM, Eric W. Biederman wrote:
>>>> > Nikolay Borisov <kernel@...p.com> writes:
>>>> >
>>>> >> This patchset converts inotify to using the newly introduced
>>>> >> per-userns sysctl infrastructure.
>>>> >>
>>>> >> Currently the inotify instances/watches are being accounted in the
>>>> >> user_struct structure. This means that in setups where multiple
>>>> >> users in unprivileged containers map to the same underlying
>>>> >> real user (i.e. pointing to the same user_struct) the inotify limits
>>>> >> are going to be shared as well, allowing one user (or application) to exhaust
>>>> >> all others' limits.
>>>> >>
>>>> >> Fix this by switching the inotify sysctls to using the
>>>> >> per-namespace/per-user limits. This will allow the server admin to
>>>> >> set sensible global limits, which can further be tuned inside every
>>>> >> individual user namespace.
>>>> >>
>>>> >> Signed-off-by: Nikolay Borisov <kernel@...p.com>
>>>> >> ---
>>>> >> Hello Eric,
>>>> >>
>>>> >> I saw you've finally sent your pull request for 4.9 and it
>>>> >> includes your implementation of the ucount infrastructure. So
>>>> >> here is my respin of the inotify patches using that.
>>>> >
>>>> > Thanks.  I will take a good hard look at this after -rc1 when things are
>>>> > stable enough that I can start a new development branch.
>>>> >
>>>> > I am a little concerned that the old sysctls have gone away.  If no one
>>>> > cares it is fine, but if someone depends on them existing that may count
>>>> > as an unnecessary userspace regression.  But otherwise skimming through
>>>> > this code it looks good.
>>>>
>>>> So indeed this is a real issue and I meant to write something about
>>>> it. Anyway, in order to preserve those sysctls, what can be done is to
>>>> hook them up with a custom sysctl handler taking the ns from the proc
>>>> mount and the euid of current? I think this is a good approach, but
>>>> let's wait and see if anyone will have objections to completely
>>>> eliminating those sysctls.
>>>
>>> Well, I believe just discarding those sysctls is not an option - I'm pretty
>>> sure there are scripts out there which tune these sysctls and those would
>>> stop working. IMO that is not an acceptable regression.
>>
>> Nikolay, there is your objection.
>>
>> So since it should be straightforward, let's preserve the existing
>> sysctls.  Then this change doesn't need to prove there are no scripts
>> that tweak those sysctls.
>>
>> We are just talking about changing the values in the initial user namespace, so
>> it should be completely compatible and straightforward to implement
>> unless I am missing something.
>
> Well I'm not so sure about this. Let's say those sysctls are going to
> modify the ucount values in the init_user_ns. That's fine, however for
> which particular user should they do this? Should it be hardcoded for
> kuid 0 or current_euid? I personally think they should be changing
> the values for the current_euid.

Unless I have missed something, the limits are per user namespace.  The
counts are per user in that namespace.  Certainly that is what the rest
of the ucount infrastructure is doing.

At which point having the existing sysctls simply update the limit in
the initial user namespace should result in no change.
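
To make the distinction concrete, here is a toy model in Python. It is only
a sketch and not kernel code: the names UserNamespace, add_watch and
legacy_sysctl_write are hypothetical, and the charging of counts up the
namespace hierarchy is deliberately left out. It assumes one limit stored on
the namespace and one counter per uid inside that namespace, with the legacy
sysctl writing the limit of the initial namespace.

```python
class UserNamespace:
    """Toy stand-in for a user namespace (hypothetical, not the kernel struct)."""

    def __init__(self, max_watches):
        # One limit for the whole namespace.
        self.max_watches = max_watches
        # One counter per user (uid) within this namespace.
        self.counts = {}

    def add_watch(self, uid):
        """Charge one watch to uid; refuse if that user's count is at the limit."""
        used = self.counts.get(uid, 0)
        if used >= self.max_watches:
            return False
        self.counts[uid] = used + 1
        return True


def legacy_sysctl_write(init_ns, value):
    """Model of the old sysctl: it only updates the initial namespace's limit."""
    init_ns.max_watches = value


init_ns = UserNamespace(max_watches=2)
print(init_ns.add_watch(1000))  # first watch for uid 1000
print(init_ns.add_watch(1000))  # second watch for uid 1000
print(init_ns.add_watch(1000))  # refused: uid 1000 is at the limit
print(init_ns.add_watch(1001))  # uid 1001 has its own count
legacy_sysctl_write(init_ns, 3)
print(init_ns.add_watch(1000))  # allowed again after raising the limit
```

In this model no question of "which user" arises when the sysctl is written:
the write touches the namespace's limit, and every user's count is checked
against that same limit.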

Eric
