linux-kernel - Re: [PATCH 0/3] ipc subsystem refcounter conversions

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87a85wvsxa.fsf@xmission.com>
Date:   Mon, 29 May 2017 05:49:53 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Christoph Hellwig <hch@...radead.org>,
        Kees Cook <keescook@...omium.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Elena Reshetova <elena.reshetova@...el.com>,
        Greg KH <gregkh@...uxfoundation.org>,
        Ingo Molnar <mingo@...hat.com>,
        Alexey Dobriyan <adobriyan@...il.com>,
        "Serge E. Hallyn" <serge@...lyn.com>, arozansk@...hat.com,
        Davidlohr Bueso <dave@...olabs.net>,
        Manfred Spraul <manfred@...orfullife.com>,
        "axboe\@kernel.dk" <axboe@...nel.dk>,
        James Bottomley <James.Bottomley@...senpartnership.com>,
        "x86\@kernel.org" <x86@...nel.org>, Ingo Molnar <mingo@...nel.org>,
        Arnd Bergmann <arnd@...db.de>,
        "David S. Miller" <davem@...emloft.net>,
        Rik van Riel <riel@...hat.com>,
        linux-arch <linux-arch@...r.kernel.org>,
        "kernel-hardening\@lists.openwall.com" 
        <kernel-hardening@...ts.openwall.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/3] ipc subsystem refcounter conversions

Peter Zijlstra <peterz@...radead.org> writes:

> On Mon, May 29, 2017 at 04:11:13AM -0500, Eric W. Biederman wrote:
>
>> Kees I I have a concern:
>> 
>> __must_check bool refcount_add_not_zero(unsigned int i, refcount_t *r)
>> {
>>         unsigned int new, val = atomic_read(&r->refs);
>> 
>>         do {
>>                 if (!val)
>>                         return false;
>> 
>>                 if (unlikely(val == UINT_MAX))
>>                         return true;
>> 
>>                 new = val + i;
>>                 if (new < val)
>>                         new = UINT_MAX;
>> 
>>         } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
>> 
>>         WARN_ONCE(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");
>> 
>>         return true;
>> }
>> 
>> Why in the world do you succeed when you the value saturates????
>
> Why not? On saturation the object will leak and returning a reference to
> it is always good.
>
>> From a code perspective that is bizarre.   The code already has to handle
>> the case when the counter does not increment.
>
> I don't see it as bizarre, we turned an overflow/use-after-free into a
> leak. That's the primary mechanism here.
>
> As long as we have a reference to a leaked object, we might as well use
> it, its not going anywhere.
>
>> Fixing the return value would move refcount_t into the realm of
>> something that is desirable because it has bettern semantics and
>> is more useful just on a day to day correctness point of view.  Even
>> ignoring the security implications.
>
> It changes the semantics between inc_not_zero() and inc(). It also
> complicates the semantics of inc_not_zero(), where currently the failure
> implies the count is 0 and means no-such-object, you complicate matters
> by basically returning 'busy'.

Busy is not a state of a reference count.

It is true I am suggesting treating something with a saturated reference
as not available.  If that is what you mean by busy.  But if it's
reference is zero it is also not available.  So there is no practical
difference.

> That is a completely new class of failure that is actually hard to deal
> with, not to mention that it completely destroys refcount_inc_not_zero()
> being a 'simple' replacement for atomic_inc_not_zero().
>
> In case of the current failure, the no-such-object, we can fix that by
> creating said object. But what to do on 'busy' ? Surely you don't want
> to create another. You'd have to somehow retrofit something to wait on
> in every user.

Using little words.

A return of true from inc_not_zero means we took a reference.
A return of false means we did not take a reference.

The code already handles I took a reference or I did not take a
reference.

Therefore lying with refcount_t is not helpful.  It takes failures
the code could easily handle and turns them into leaks.

At least that is how I have seen reference counts used.  And those
are definitely the plane obivous semantics.

Your changes are definitely not drop in replacements for atomic_t in my
code.

Eric