[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wh75aOWvaEhYsAcCJSRzhy8MO_SdvDzB6zmvVs+FeAr6w@mail.gmail.com>
Date: Fri, 28 Oct 2022 12:01:00 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Yury Norov <yury.norov@...il.com>,
Catalin Marinas <catalin.marinas@....com>,
Mark Rutland <mark.rutland@....com>,
Will Deacon <will@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 1/5] ARM: findbit: document ARMv5 bit offset calculation
On Fri, Oct 28, 2022 at 10:45 AM Russell King (Oracle)
<linux@...linux.org.uk> wrote:
>
> For the _find_first_bit, there isn't much difference in the number
> of instructions or really what is going on, only the organisation
> and flow of the code is more inline - but that shouldn't make much
> of a difference. Yet, there is a definite repeatable measurable
> difference between the two:
Hmm. Interestingly, your _find_first_zero_bit_le() (which
find_next_bit ends up using except for the first byte) ends up doing
an optimization that is technically not valid.
In particular, the *generic* code does
sz = min(idx * BITS_PER_LONG + __ffs(MUNGE(val)), sz);
for the final result.
In contrast, the arm code doesn't do the "min()" at all, and if there
are bits after the bitmap (in a partial byte), it will just return
those bits.
So the arm code ends up avoiding some operations. Which works most of
the time, because
(a) usually bitmaps are byte-aligned anyway
(b) most users do "find_first_bit(.., size) >= size" as the "found no
bits" test
but it actually looks to me like your handcoded arm code is simply
wrong. At least going by our docbook comments for find_first_bit:
* Returns the bit number of the first set bit.
* If no bits are set, returns @size.
And look here: bitmap_empty() does
return find_first_bit(src, nbits) == nbits;
and now imagine that 'nbits' is not a small constant value (which we
handle separately) and is also not byte aligned.
Maybe I'm mis-reading your assembly (I "know" arm assembly, but I
can't read it fluently like x86). But I don't think so.
So I think your code is actually buggy, but probably the bug is quite
hard to trigger in practice due to (a)/(b) above.
We do have bitmaps that aren't byte-aligned. The cpumask ones are the
most common ones. But in the cpumask_first() implementation (which is
just a wrapper for find_first_bit()), our documentation actually says
* Returns >= nr_cpu_ids if no cpus set.
and I think that may have been what we historically did elsewhere too,
and may be the source of the arm situation.
Anyway, this can be fixed by either
(a) fixing the arm code
(b) changing the docs and making that ">= size" be the right thing to do
but I do think this is a very strong example of why having
architecture-specific code like this is very very dangerous. Because
as it stands now, that arm code really looks like it's actively buggy
to me.
Linus
Powered by blists - more mailing lists