linux-kernel - Re: [PATCH] block: switch to atomic

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHk-=wh0RhnMfZG6xQJ=yHTgmPTaxjQOo1Q2=r+_ZR56yiRi4A@mail.gmail.com>
Date:   Mon, 6 Dec 2021 13:17:17 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Kees Cook <keescook@...omium.org>
Cc:     Jens Axboe <axboe@...nel.dk>,
        Peter Zijlstra <peterz@...radead.org>,
        Christoph Hellwig <hch@...radead.org>,
        "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] block: switch to atomic_t for request references

On Mon, Dec 6, 2021 at 12:51 PM Kees Cook <keescook@...omium.org> wrote:
>
> I'd like core code to be safe too, though. :)

I've told you before, and I'll tell you again: 'refcount_t' is not "safe".

refcount_t is pure garbage, and is HORRID.

The saturating arithmetic is a fundamental and fatal flaw. It is pure
and utter crap.

It means that you *cannot* recover from mistakes, and the refcount
code is broken.

Stop calling it "safe" or arguing that it protects against anything at all.

It is literally and objectively WORSE than what the page counting code
does using atomics.

I don't know why you cannot seem to admit to that. refcount_t is
misdesigned in a very fundamental way.

It generates horrible code, but it also generates actively BROKEN
code. Saturation is not the answer. Saturation is the question, and
the answer is "No".

And no, anybody who thinks that saturation is "safe" is just lying to
themselves and others.

The most realistic main worry for refcounting tends to be underflow,
and the whole recount underflow situation is no better than an atomic
with a warning.

Because by the time the underflow is detected, it's already too late -
the "decrement to zero" was what resulted in the data being free'd -
so the whole "decrement past zero" is when things have already gone
south earlier, and all you can do is warn.

End result: atomics with warnings are no worse in the underflow case.

And the overflow case where the saturation is 'safe", has been
literally mis-designed to not be recoverable, so refcount_t is
literally broken.

End result: atomics are _better_ in the overflow case, and it's why
the page counters could not use the garbage that is refcount_t, and
instead did it properly.

See? In absolutely neither case is recount_t "safer". It's only worse.

I like Jens' patches. They take the _good_ code - the code we use for
page counters - and make that proper interface available to others.

Not the broken refcount_t code that is unfixable, and only good for
"we have a thousand drivers, let them wallow in this thing because we
can never make them all handle overflows properly".

And every single core use of refcount_t that doesn't have a million
random drivers using it should be converted to use that proper atomic
interface.

                    Linus