Message-ID: <aGCbRguHwFY372Ut@yury>
Date: Sat, 28 Jun 2025 21:48:02 -0400
From: Yury Norov <yury.norov@...il.com>
To: cp0613@...ux.alibaba.com
Cc: alex@...ti.fr, aou@...s.berkeley.edu, arnd@...db.de,
linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-riscv@...ts.infradead.org, linux@...musvillemoes.dk,
palmer@...belt.com, paul.walmsley@...ive.com
Subject: Re: [PATCH 2/2] bitops: rotate: Add riscv implementation using Zbb
extension
On Sat, Jun 28, 2025 at 07:13:57PM +0800, cp0613@...ux.alibaba.com wrote:
> On Fri, 20 Jun 2025 12:20:47 -0400, yury.norov@...il.com wrote:
>
> > Can you add a comment about what is happening here? Are you sure it's
> > optimized out in case of the 'legacy' alternative?
>
> Thank you for your review. Yes, I referred to the existing variable__fls()
> implementation, which should be fine.
No, it's not fine. You trimmed your original email completely, so
there's no way to understand what I'm asking about, and you didn't
answer my question. So I'll ask again: what exactly are you doing
in the line you've trimmed out?
> > Here you wire ror/rol() to the variable_ror/rol() unconditionally, and
> > that breaks compile-time rotation if the parameter is known at compile
> > time.
> >
> > I believe, generic implementation will allow compiler to handle this
> > case better. Can you do a similar thing to what fls() does in the same
> > file?
>
> I did consider it, but I did not find any toolchain that provides an
> implementation similar to __builtin_ror or __builtin_rol. If there is one,
> please help point it out.
This is an example of the toolchain you're looking for:
/**
 * rol64 - rotate a 64-bit value left
 * @word: value to rotate
 * @shift: bits to roll
 */
static inline __u64 rol64(__u64 word, unsigned int shift)
{
	return (word << (shift & 63)) | (word >> ((-shift) & 63));
}
What I'm asking is: please show me that rol/ror with constant arguments
is still computed at compile time, e.g. that ror64(1234, 12) is folded
to a constant by the compiler.
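One rough way to check this outside the kernel is sketched below. It is a user-space approximation, not the kernel code: uint64_t stands in for __u64, and ror64_const_folded() and ror64_selfcheck() are hypothetical helpers made up for illustration. __builtin_constant_p is a GCC/Clang builtin, and whether it reports folding for an inlined call depends on the compiler and optimization level, so inspecting the generated assembly remains the more reliable check.

```c
#include <assert.h>
#include <stdint.h>

/* User-space copy of the generic rotate-right idiom (uint64_t replaces __u64). */
static inline uint64_t ror64(uint64_t word, unsigned int shift)
{
	return (word >> (shift & 63)) | (word << ((-shift) & 63));
}

/*
 * Hypothetical helper: asks the compiler whether the call below was
 * reduced to a constant. The answer depends on the optimization level,
 * so no particular return value is promised here.
 */
static inline int ror64_const_folded(void)
{
	return __builtin_constant_p(ror64(1234, 12));
}

/*
 * Hypothetical self-check: 1234 is 0x4d2, so rotating right by 12
 * moves those bits into the top 12 bits of the 64-bit word.
 */
static inline void ror64_selfcheck(void)
{
	assert(ror64(1234, 12) == 0x4d20000000000000ULL);
}
```

If an arch-specific version routes every call through inline asm, the compiler can no longer see through the rotation, and a check like ror64_const_folded() would be expected to report no folding.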
> In addition, I did not consider it carefully before. If the rotate function
> is to be genericized, all archs need to include <asm-generic/bitops/rotate.h>.
> I missed this step.
Sorry, I'm lost here about what you've considered and what not. I'm OK
with accelerating ror/rol, but I want to make sure that:
1. The most trivial compile-time case is actually evaluated at compile time; and
2. Any arch-specific code is well explained; and
3. The legacy case is optimized just as well as the non-legacy one.
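As a minimal sketch of how point 1 could be kept, assuming a dispatch loosely modelled on the __builtin_constant_p pattern used for fls(): the names generic_rol64(), variable_rol64() and rol64_dispatch() are hypothetical, and variable_rol64() is only a plain-C stand-in for whatever arch-specific path (e.g. a Zbb instruction behind an alternative) would really go there.

```c
#include <assert.h>
#include <stdint.h>

/* Generic C rotate; the compiler can fold this when both arguments are constants. */
static inline uint64_t generic_rol64(uint64_t word, unsigned int shift)
{
	return (word << (shift & 63)) | (word >> ((-shift) & 63));
}

/*
 * Hypothetical stand-in for the arch-specific path. It simply reuses
 * the generic code here so that the sketch stays portable.
 */
static inline uint64_t variable_rol64(uint64_t word, unsigned int shift)
{
	return generic_rol64(word, shift);
}

/*
 * Constant shift: take the generic path and let the compiler fold it.
 * Run-time shift: take the arch-specific path.
 */
#define rol64_dispatch(word, shift)				\
	(__builtin_constant_p(shift) ?				\
	 generic_rol64((word), (shift)) :			\
	 variable_rol64((word), (shift)))
```

With this shape, rol64_dispatch(x, 12) never reaches the arch-specific code, while a shift computed at run time does. Both paths have to return the same value for every input.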
Thanks,
Yury