linux-kernel - Re: [PATCH] ucounts: Fix UCOUNT_RLIMIT

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHk-=whfoVN6wiP5VHekckvqivRhpB+b1FnwyWEjz1SB2FN6HQ@mail.gmail.com>
Date:   Fri, 2 Jul 2021 15:13:18 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Alexey Gladkov <legion@...nel.org>
Cc:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux Containers <containers@...ts.linux.dev>
Subject: Re: [PATCH] ucounts: Fix UCOUNT_RLIMIT_SIGPENDING counter leak

On Fri, Jul 2, 2021 at 10:55 AM Alexey Gladkov <legion@...nel.org> wrote:
>
> @@ -424,10 +424,10 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t gfp_flags,
>          * changes from/to zero.
>          */
>         rcu_read_lock();
> -       ucounts = task_ucounts(t);
> +       ucounts = ucounts_new = task_ucounts(t);
>         sigpending = inc_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING, 1);
>         if (sigpending == 1)
> -               ucounts = get_ucounts(ucounts);
> +               ucounts_new = get_ucounts(ucounts);
>         rcu_read_unlock();

I think this is still problematic.

If get_ucounts() fails, we can't just drop the RCU lock and (later)
use "ucounts" that we hold no reference to.

Or am I missing something? I'm not entirely sure about the lifetime of
that RCU protection, and I do note that "task_ucounts()" uses
"task_cred_xxx()", which already does
rcu_read_lock()/rcu_read_unlock() in the actual access.

So I'm thinking the code could/should be written something like this instead:

  diff --git a/kernel/signal.c b/kernel/signal.c
  index f6371dfa1f89..40781b197227 100644
  --- a/kernel/signal.c
  +++ b/kernel/signal.c
  @@ -422,22 +422,33 @@ __sigqueue_alloc(int sig, struct task_struct
*t, gfp_t gfp_flags,
         * NOTE! A pending signal will hold on to the user refcount,
         * and we get/put the refcount only when the sigpending count
         * changes from/to zero.
  +      *
  +      * And if the ucount rlimit overflowed, we do not get to use it at all.
         */
        rcu_read_lock();
        ucounts = task_ucounts(t);
        sigpending = inc_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING, 1);
  -     if (sigpending == 1)
  -             ucounts = get_ucounts(ucounts);
  +     switch (sigpending) {
  +     case 1:
  +             if (likely(get_ucounts(ucounts)))
  +                     break;
  +
  +             dec_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING, 1);
  +             fallthrough;
  +     case LONG_MAX:
  +             rcu_read_unlock();
  +             return NULL;
  +     }
        rcu_read_unlock();

  -     if (override_rlimit || (sigpending < LONG_MAX && sigpending <=
task_rlimit(t, RLIMIT_SIGPENDING))) {
  +     if (override_rlimit || sigpending <= task_rlimit(t,
RLIMIT_SIGPENDING)) {
                q = kmem_cache_alloc(sigqueue_cachep, gfp_flags);
        } else {
                print_dropped_signal(sig);
        }

        if (unlikely(q == NULL)) {
  -             if (ucounts && dec_rlimit_ucounts(ucounts,
UCOUNT_RLIMIT_SIGPENDING, 1))
  +             if (dec_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING, 1))
                        put_ucounts(ucounts);
        } else {
                INIT_LIST_HEAD(&q->list);

(and no, I'm not sure it's a good idea to make that use a "switch()" -
maybe the LONG_MAX case should be a "if (unlikely())" thing after the
rcu_read_ulock() instead?

Hmm?

The alternate thing is to say "No, Linus, you're a nincompoop and
wrong, that RCU protection is a non-issue because we hold a reference
to the task, and task_ucounts() will not change, so the RCU read lock
doesn't do anything".

Although then I would think the rcu_read_lock/rcu_read_unlock here is
entirely pointless.

               Linus