[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wj_4xsWxLqPvkCV6eOJt7quXS8DyXn3zWw3W94wN=6yig@mail.gmail.com>
Date: Fri, 23 Dec 2022 10:44:07 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Yury Norov <yury.norov@...il.com>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
Tariq Toukan <tariqt@...dia.com>,
Valentin Schneider <vschneid@...hat.com>,
linux-rdma@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [GIT PULL] bitmap changes for v6.2-rc1
On Thu, Dec 15, 2022 at 3:11 PM Yury Norov <yury.norov@...il.com> wrote:
>
> Please pull bitmap patches for v6.2. They spent in -next for more than
> a week without any issues. The branch consists of:
So I've been holding off on this because these bitmap pulls have
always scared me, and I wanted to have the time to actually look
through them in detail before pulling.
I'm back home, over the travel chaos, and while I have other pulls
pending, they seem to be benign fixes so I started looking at this.
And when looking at it, I did indeed finx what I think is a
fundamental arithmetic bug.
That small_const_nbits_off() is simply buggy.
Try this:
small_const_nbits_off(64,-1);
and see it return true.
The thing is, doing divides in C is something you need to be very
careful about, because they do *not* behave nicely with signed values.
They do the "mathematically intuitive" thing of truncating towards
zero, but when you are talking about bit ranges like this, that is
*NOT* what you want.
The comment is fairly good, but it just doesn't match the code.
If you want to check that 'nbits-1' and 'off' are in the same word,
you simply should not use division at all. Sure, it would work, if you
verify that they are both non-negative, but the fact is, even then
it's just wrong, wrong, wrong. You want a shift operation or a masking
operation.
Honestly, in this case, I think the logical thing to do is "check that
the upper bits are the same". The way you do that is probably
something like
!((off) ^ ((nbits)-1) & ~(BITS_PER_LONG-1))
which doesn't have the fundamental bug that the divide version has.
So anyway, I won't be pulling this. I appreciate that you are trying
to micro-optimize the bitop functions, but this is not the first time
your pull requests have made me worried and caused me to not pull.
This is such basic functionality that I really need these pull
requests to be less worrisome.
In other words, not only don't I want to see these kinds of bugs -
please make sure that there isn't anythign that looks even *remotely*
scary. The micro-optimizations had better be *so* obvious and so well
thought through that I not only don't find bugs in them, I don't even
get nervous about finding bugs.
Side note: that whole small_const_nbits_off() thing could be possibly
improved in other ways. It's actually unnecessarily strict (if it was
correct, that is): we don't really require 'off' to be constant, what
we *do* want is
- "nbits > off" needs to be a compile-time true constant
- that "off is in the same word as nbits-1" also needs to be a
compile-time true constant.
and the compiler could determine both of those to be the case even if
'off' itself isn't a constant, if there is code around it that
verifies it.
For example, if you have code like this:
off = a & 7;
n = find_next_bit(bitmap, 64, off);
then the compiler could still see that it's that "last word only"
case, even though 'off' isn't actually a constant value.
So you could try using a helper macro like
#define COMPILE_TIME_TRUE(x) (__builtin_constant_p(x) && (x))
to handle those kinds of things. But again - the code needs to be
OBVIOUSLY CORRECT. Make me feel those warm and fuzzies about it
because it's so obviously safe and correct.
That said, I see your test-cases for your micro-optimizations, but I'd
also like to see references to where it actually improves real code.
I know for a fact that 'small_const_nbits()' actually triggers in real
life. I don't see that your 'small_const_nbits_off()' would trigger in
real life in any situation that isn't already covered by
'small_const_nbits()'.
So convince me not only that the optimizations are obviously correct,
but also that they actually matter.
Linus
Powered by blists - more mailing lists