Message-ID: <CAHk-=wgaaFFoc12yqoQVcQEHN4rYTxTzd7uKgFG6Y0XzJbxpAA@mail.gmail.com>
Date: Tue, 20 Feb 2024 14:02:19 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Guenter Roeck <linux@...ck-us.net>
Cc: Matthew Auld <matthew.auld@...el.com>,
Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@....com>,
Christian König <christian.koenig@....com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 6.8-rc5
On Tue, 20 Feb 2024 at 13:48, Guenter Roeck <linux@...ck-us.net> wrote:
>
> Turns out it wasn't this code, but
>
> > Now, the __moddi3() is a *bit* more reasonable, because I assume it comes from
> >
> > int slot = i % 3;
>
> this code.

Yeah. It's still the kernel doing silly things for no good reason, but
a compiler can certainly do a small-constant 64-bit unsigned division
without going to the expense of actually doing a full divide.

For example, in this case, because 2**32 mod 3 is 1, you can literally
just add the high bits together with the low bits (with carry), and do
a 32-bit modulus.
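
A minimal sketch of that folding step, for illustration only. The helper name mod3_u64 is hypothetical and is not taken from the kernel or from gcc. It assumes an unsigned 64-bit input and relies on 2**32 being congruent to 1 modulo 3, so the high word and the carry can simply be added to the low word:

```c
#include <stdint.h>

/* Hypothetical helper: x % 3 for a 64-bit unsigned x, using only
 * 32-bit arithmetic. Valid because 2**32 == 1 (mod 3). */
static uint32_t mod3_u64(uint64_t x)
{
	uint32_t hi = (uint32_t)(x >> 32);
	uint32_t lo = (uint32_t)x;
	uint32_t sum = hi + lo;		/* may wrap around 2**32 */

	/* A wrapped add lost one 2**32, which is worth 1 modulo 3.
	 * After a wrap, sum <= 2**32 - 2, so adding 1 cannot wrap again. */
	if (sum < lo)
		sum += 1;

	return sum % 3;			/* plain 32-bit modulus */
}
```

The same fold works for any divisor d where 2**32 mod d is 1; it is not a general replacement for 64-bit modulus.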
And in fact, you can then turn that 32-bit modulus into a multiply
instead, avoiding doing any expensive divide at all.
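
As a rough illustration of the multiply trick (a sketch, not the code gcc actually emits): for a 32-bit unsigned n, the quotient n / 3 can be computed as (n * 0xAAAAAAAB) >> 33, where 0xAAAAAAAB is (2**33 + 1) / 3. The helper name mod3_u32 is hypothetical. Note that it depends on a 32x32->64 multiply:

```c
#include <stdint.h>

/* Hypothetical helper: n % 3 for a 32-bit unsigned n without a
 * divide instruction, using a widening multiply by a reciprocal. */
static uint32_t mod3_u32(uint32_t n)
{
	/* 0xAAAAAAAB == (2**33 + 1) / 3, so the shifted product is
	 * exactly floor(n / 3) for every 32-bit n. */
	uint32_t q = (uint32_t)(((uint64_t)n * 0xAAAAAAABu) >> 33);

	return n - q * 3;		/* remainder from the quotient */
}
```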
And gcc knows to do all this. I *suspect* that the failing
architectures end up not having a 32x32->64 multiply, or maybe they
just don't have a very good machine description, and that's why gcc
failed on them and just ended up doing the stupid thing.

Regardless, our kernel code was just not good. It should be fixed now.

Linus