linux-kernel - Re: [GIT PULL] ucount fix for v5.14-rc

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [day] [month] [year] [list]

Message-ID: <CAHk-=wjdfLQ+z=uM3qUPSb1wgnugeN5+wyH11kmatUXskYqrCA@mail.gmail.com>
Date:   Fri, 6 Aug 2021 10:37:52 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Hillf Danton <hdanton@...a.com>
Cc:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Alexey Gladkov <legion@...nel.org>
Subject: Re: [GIT PULL] ucount fix for v5.14-rc

On Thu, Aug 5, 2021 at 11:15 PM Hillf Danton <hdanton@...a.com> wrote:
>
> On Thu, 05 Aug 2021 22:38:05 -0500 Eric W. Biederman wrote:
> >
> >I think you are saying if someone calls get_ucounts without knowing the
> >reference count is at least one a user after free will happen.  It is a
> >bug to call get_ucounts unless it is known the reference count is at
> >least one.
>
> Doubt it works because no atomicity is ensured by two atomic operations
> in tow.
>
>         if (atomic_read(&ucounts->count) >= 1)
>                 if (ucounts && atomic_add_negative(1, &ucounts->count))

Note that the first atomic_read() check is purely a debug check.

Eric's point is that you can't do "get_ucounts()" on an ucount that
you don't already have a reference to - that would be a bug even
*without* the "get_ucounts()", because you're clearly dealing with an
ucount that could be released at any time.

So you have only a couple of operations:

 (a) find_ucounts() looks up the ucount for a user that doesn't have a ref yet.

 (b) get_ucounts() increments a ref for somebody who already has a
ucount but is giving it away to somebody else too. We know the ref
can't go down to zero, because we are ourselves holding a ref to it.

 (c) put_ucounts() decrements a ref (and frees it when the refs go
down to zero).

Of these, (a) needs to be called under the lock, and needs to
increment the ref-count before the lock is released.

But (b) does *not* need a lock, exactly because the current getter
must already hold a ref, so it can't go away.

And (c) needs to hold a lock only for the "goes to zero" case, so you
can avoid the lock if the count started out higher than 1 (which is
why we have that atomic_dec_and_lock_irqsave() pattern)

The bug was in (a) and (c), but (b) is fine.

Side note: other data structures - not ucounts - can have slightly
more complex behavior, and in particular you can do the find/put
operations without locking if you

 - use RCU to free it

 - do the "find" in a RCU read-locked section

 - after finding it, you use "get_ref_not_zero()" to do an atomic "did
somebody free the last ref, if not increment it"

and the above pattern allows you to do all of a-c without any actual
locking, just local atomics.

That's what all the really critical kernel data structures tend to do.

(And the *really* critical data structures with complex topology - ie
dentries - have a local lock too, and use the lockref data structure,
but that's basically just dentries and the gfs2 gl/qd_lock - and I
have a sneaking suspicion that the gfs2 use is more of a "because I
can" rather than a "because I need to")

                 Linus