linux-kernel - Re: How should rlimits, suid exec, and capabilities interact?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <878ru1qcos.fsf@email.froward.int.ebiederm.org>
Date:   Wed, 23 Feb 2022 19:24:19 -0600
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     Linus Torvalds <linus@...valds.org>
Cc:     Linux API <linux-api@...r.kernel.org>,
        Etienne Dechamps <etienne@...champs.fr>,
        Alexey Gladkov <legion@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Shuah Khan <shuah@...nel.org>,
        Christian Brauner <brauner@...nel.org>,
        Solar Designer <solar@...nwall.com>,
        Ran Xiaokai <ran.xiaokai@....com.cn>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        "open list:KERNEL SELFTEST FRAMEWORK" 
        <linux-kselftest@...r.kernel.org>,
        Linux Containers <containers@...ts.linux-foundation.org>,
        Michal Koutný <mkoutny@...e.com>,
        Security Officers <security@...nel.org>,
        Neil Brown <neilb@....unsw.edu.au>, NeilBrown <neilb@...e.de>,
        "Serge E. Hallyn" <serge@...lyn.com>, Jann Horn <jannh@...gle.com>,
        Andy Lutomirski <luto@...nel.org>, Willy Tarreau <w@....eu>
Subject: Re: How should rlimits, suid exec, and capabilities interact?


Linus Torvalds <linus@...valds.org> writes:

> Basic rule: it's better to be too lenient than to be too strict.

Thank you.  With that guideline I can explore the space of what is
possible.

Question: Running a suid program today charges the activity of that
program to the user who ran that program, not to the user the program
runs as.  Does anyone see a problem with charging the user the program
runs as?

The reason I want to change who is charged with a process (besides it
making more sense in my head) is so that capable(CAP_SYS_RESOURCE) can
be used instead of the magic incantation (cred->user == INIT_USER).

An accidental experiment happened in v5.14-rc1 in July when the ucount
rlimit code was merged.  It was only this last week when after Michal
Koutný discovered the discrepency through code inspect a bug fix was
merged.

This changes the behavior that has existed in some form since Linux v1.0
when per user process limits were added.

The original code in v1.0 looked like:
> static int find_empty_process(void)
> {
>        int free_task;
>        int i, tasks_free;
>        int this_user_tasks;
> 
> repeat:
>        if ((++last_pid) & 0xffff8000)
>                last_pid=1;
>        this_user_tasks = 0;
>        tasks_free = 0;
>        free_task = -EAGAIN;
>        i = NR_TASKS;
>        while (--i > 0) {
>                if (!task[i]) {
>                        free_task = i;
>                        tasks_free++;
>                        continue;
>                }
>                if (task[i]->uid == current->uid)
>                        this_user_tasks++;
>                if (task[i]->pid == last_pid || task[i]->pgrp == last_pid ||
>                    task[i]->session == last_pid)
>                        goto repeat;
>        }
>        if (tasks_free <= MIN_TASKS_LEFT_FOR_ROOT ||
>            this_user_tasks > MAX_TASKS_PER_USER)
>                if (current->uid)
>                        return -EAGAIN;
>        return free_task;
> }

Having tracked the use of real uid in limits back this far my guess
is that it was an accident of the implementation and real uid vs
effective uid had not be considered.

Does anyone know if choosing the real uid was a deliberate decision
anywhere in the history of Linux?



Linus you were talking about making it possible to login as I think a
non-root user to be able to use sudo and kill a fork bomb.

The counter case is apache having a dedicated user for running
cgi-scripts and using RLIMIT_NPROC to limit how many of those processes
can exist.  Unless I am misunderstanding something that looks exactly
like your login as non-root so you can run sudo to kill a fork-bomb.

A comment from an in-process cleanup patch explains this as best I can:
         /*
         * In general rlimits are only enforced when a new resource
         * is acquired.  That would be during fork for RLIMIT_NPROC.
         * That is insufficient for RLIMIT_NPROC as many attributes of
         * a new process must be set between fork and exec.
         *
         * A case where this matter is when apache runs forks a process
         * and calls setuid to run cgi-scripts as a different user.
         * Generating those processes through a code sequence like:
         *
         *      fork()
 	 *	setrlimit(RLIMIT_NPROC, ...)         
         *      execve()  -- suid wrapper
         *      setuid()
         *      execve()  -- cgi script
         *
         * The cgi-scripts are unlikely to fork on their own so unless
         * RLIMIT_NPROC is checked after the user change and before
         * the cgi-script starts, RLIMIT_NPROC simply will not be enforced
         * for the cgi-scripts.
         *
         * So the code tracks if between fork and exec if an operation
         * occurs that could cause the RLIMIT_NPROC check to fail.  If
         * such an operation has happened re-check RLIMIT_NPROC.
         */


Answered-Question: I was trying to ask if anyone knows of a reason why
we can't just sanitize the rlimits of the process during suid exec?
Linus your guideline would appear to allow that behavior.  Unfortunately
that looks like it would break current usage of apache suexec.

Eric