[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFyiEO8o=1y_FaWjUr5C1JMCoN2h5aMJuPTG=7AexEZLLQ@mail.gmail.com>
Date: Tue, 31 Jan 2017 13:32:38 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Tejun Heo <tj@...nel.org>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
dougmill@...ux.vnet.ibm.com
Subject: Re: [GIT PULL] percpu fix for v4.10-rc6
On Tue, Jan 31, 2017 at 8:55 AM, Tejun Heo <tj@...nel.org> wrote:
>
> Douglas found and fixed a ref leak bug in percpu_ref_tryget[_live]().
> The bug is caused by storing the return value of
> atomic_long_inc_not_zero() into an int temp variable before returning
> it as a bool. The interim cast to int loses the upper bits and can
> lead to false negatives. As percpu_ref uses a high bit to mark a
> draining counter, this can happen relatively easily. Fixed by using
> bool for the temp variable.
I think this fix is wrong.
The fact is, atomic_long_inc_not_zero() shouldn't be returning
anything with high bits.. Casting to "int" should have absolutely no
impact. "int" is the traditional C truth value (with zero/nonzero
being false/true), and while we're generally moving towards "bool" for
true/false return values, I do think that code that assumes that these
functions can be cast to "int" are right.
For example, we used to have similar bugs in "test_bit()" returning
the actual bit value (which could be high).
And when users hit that problem, we fixed "test_bit()", not the users of it.
So I'd rather fix the places that (insanely) return a 64-bit value.
Is this purely a ppc64 issue, or does it happen somewhere else too?
Basically, the functions
atomic*_sub_and_test()
atomic*_dec_and_test()
atomic*_inc_and_test()
atomic*_add_negative()
atomic*_add_unless()
should all be returning a proper bool/int truth value (preferably 0/1
so that we don't need to worry about it), not some random crazy value.
I've pulled this, but I really think it's papering over the real
issue. Adding "linux-arch" mailing list to ask architecture
maintainers to check their implementation of the atomic ops that
return a truth value.
Linus
Powered by blists - more mailing lists