[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wh_vEzmYnMufOa=03WAU=DRM5+n6uZy=dVtJERFJm3Q-Q@mail.gmail.com>
Date: Thu, 19 Aug 2021 12:09:37 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Will Deacon <will@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Borislav Petkov <bp@...en8.de>,
Xiyu Yang <xiyuyang19@...an.edu.cn>,
Alistair Popple <apopple@...dia.com>,
Yang Shi <shy828301@...il.com>,
Shakeel Butt <shakeelb@...gle.com>,
Hugh Dickins <hughd@...gle.com>,
Miaohe Lin <linmiaohe@...wei.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>, yuanxzhang@...an.edu.cn,
Xin Tan <tanxin.ctf@...il.com>,
Will Deacon <will.deacon@....com>,
David Howells <dhowells@...hat.com>
Subject: Re: [PATCH] mm/rmap: Convert from atomic_t to refcount_t on anon_vma->refcount
On Thu, Aug 19, 2021 at 8:21 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> If we can skip the OF... we can do something like this:
Honestly, I think a lot of the refcount code is questionable. It was
absolutely written with no care for performance AT ALL.
I'm not sure it helps to then add arch-specific code for it without
thinking it through a _lot_ first.
It might be better to just have a "atomic_t with overflow handling" in
general, exactly because the refcount_t was designed and written
without any regard for code that cares about performance.
> static inline bool refcount_dec_and_test(refcount_t *r)
> {
> asm_volatile_goto (LOCK_PREFIX "decl %[var]\n\t"
> "jz %l[cc_zero]\n\t"
> "jns 1f\n\t"
I think you can use "jl" for the bad case.
You want to fail if the old value was negative, or if the old value
was less than what you subtracted. That's literally the "less than"
operator.
I think it's better to handle that case out-of-line than play games
with UD, though - this is going to be the rare case, the likelihood
that we get the fixup wrong is just too big. Once it's out-of-line
it's not as critical any more, even if it does add to the size of the
code.
So if we do this, I think it should be something like
static inline __must_check bool refcount_dec_and_test(refcount_t *r)
{
asm_volatile_goto (LOCK_PREFIX "decl %[var]\n\t"
"jz %l[cc_zero]\n\t"
"jl %l[cc_error]"
: : [var] "m" (r->refs.counter)
: "memory" : cc_zero, cc_error);
return false;
cc_zero:
return true;
cc_error:
refcount_warn_saturate(r, REFCOUNT_SUB_UAF);
return false;
}
and we can discuss whether we could improve on the
refcount_warn_saturate() separately.
But see above: maybe just make this a separate "careful atomic_t",
with the option to panic-on-overflow. So then we could get rid of
refcount_warn_saturate() enmtirely above, and instead just have a
(compile-time option) BUG() case, with the non-careful version just
being our existing atomic_dec_and_test.
Giving people who want a smaller kernel the option to do so, while
giving people who want checking the option too.
Linus
Powered by blists - more mailing lists