

Message-ID: <9b2dd6f5-5b0b-9c9b-e853-5795c352e092@oracle.com>
Date:   Mon, 18 Nov 2019 16:46:08 -0800
From:   "prakash.sangappa" <prakash.sangappa@...cle.com>
To:     Jann Horn <jannh@...gle.com>
Cc:     kernel list <linux-kernel@...r.kernel.org>,
        Linux API <linux-api@...r.kernel.org>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        "Serge E. Hallyn" <serge@...lyn.com>,
        Christian Brauner <christian@...uner.io>
Subject: Re: [RESEND RFC PATCH 1/1] Selectively allow CAP_SYS_NICE capability
 inside user namespaces



On 11/18/2019 11:30 AM, Jann Horn wrote:
> On Mon, Nov 18, 2019 at 6:04 PM Prakash Sangappa
> <prakash.sangappa@...cle.com> wrote:
>> Allow CAP_SYS_NICE to take effect for processes having effective uid of a
>> root user from init namespace.
> [...]
>> @@ -4548,6 +4548,8 @@ int can_nice(const struct task_struct *p, const int nice)
>>          int nice_rlim = nice_to_rlimit(nice);
>>
>>          return (nice_rlim <= task_rlimit(p, RLIMIT_NICE) ||
>> +               (ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE) &&
>> +               uid_eq(current_euid(), GLOBAL_ROOT_UID)) ||
>>                  capable(CAP_SYS_NICE));
> I very strongly dislike tying such a feature to GLOBAL_ROOT_UID.
> Wouldn't it be better to control this through procfs, similar to
> uid_map and gid_map? If you really need an escape hatch to become
> privileged outside a user namespace, then I'd much prefer a file
> "cap_map" that lets someone with appropriate capabilities in the outer
> namespace write a bitmask of capabilities that should have effect
> outside the container, or something like that. And limit that to bits
> where that's sane, like CAP_SYS_NICE.

Sounds reasonable. Adding a 'cap_map' file to the user namespace would
give more control. We could allow a capability listed in 'cap_map' to
take effect only if the corresponding capability is enabled for the user
inside the user namespace, e.g. uid 0. Start with support for
CAP_SYS_NICE?
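
To make the proposal concrete, here is a rough userspace model of the
check being discussed. It is only a sketch, not kernel code: 'cap_map'
does not exist today, and every name prefixed with model_ as well as
CAP_MAP_ALLOWED_MASK is hypothetical. It also assumes that the writer of
'cap_map' must itself hold each capability it grants in the outer
namespace, which is one possible reading of "appropriate capabilities".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CAP_SYS_NICE 23 /* same value as in <linux/capability.h> */

/* Hypothetical: the only bits considered sane to pass through. */
#define CAP_MAP_ALLOWED_MASK (UINT64_C(1) << CAP_SYS_NICE)

/* Hypothetical stand-in for the per-user-namespace state. */
struct model_userns {
	uint64_t cap_map; /* capabilities allowed to take effect outside */
};

/* Hypothetical stand-in for a task's credentials. */
struct model_task {
	uint64_t effective_in_ns; /* capabilities held inside the namespace */
	const struct model_userns *ns;
};

/*
 * Models a write to the proposed 'cap_map' file.
 * Returns 0 on success, -1 if the mask contains a bit that is not
 * allowed or that the writer does not hold in the outer namespace.
 */
int model_cap_map_write(struct model_userns *ns, uint64_t mask,
			uint64_t writer_outer_caps)
{
	if (mask & ~CAP_MAP_ALLOWED_MASK)
		return -1;
	if (mask & ~writer_outer_caps)
		return -1;
	ns->cap_map = mask;
	return 0;
}

/*
 * A capability takes effect outside the namespace only if it is set in
 * 'cap_map' AND the task holds it inside its user namespace.
 */
bool model_cap_effective_outside(const struct model_task *t, int cap)
{
	assert(cap >= 0 && cap < 64);
	uint64_t bit = UINT64_C(1) << cap;

	return (t->ns->cap_map & bit) && (t->effective_in_ns & bit);
}
```

With something like this, can_nice() would consult the map instead of
comparing the euid against GLOBAL_ROOT_UID, so the container would not
have to run as global root to get the privilege.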


>
> If we tie features like this to GLOBAL_ROOT_UID, more people are going
> to run their containers with GLOBAL_ROOT_UID. Which is a terrible,
> terrible idea. GLOBAL_ROOT_UID gives you privilege over all sorts of
> files that you shouldn't be able to access, and only things like mount
> namespaces and possibly LSMs prevent you from exercising that
> privilege. GLOBAL_ROOT_UID should only ever be given to processes that
> you trust completely.

Agreed.