Message-ID: <CAMj1kXF4biAW+FpxnkuTes5uK_G2h_pDj3kN-SMFCmLXqNCcug@mail.gmail.com>
Date: Mon, 4 Mar 2024 14:55:21 +0100
From: Ard Biesheuvel <ardb@...nel.org>
To: Arnd Bergmann <arnd@...db.de>
Cc: Andre Przywara <andre.przywara@....com>, Naresh Kamboju <naresh.kamboju@...aro.org>,
open list <linux-kernel@...r.kernel.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>, linux-sunxi@...ts.linux.dev,
dri-devel@...ts.freedesktop.org, lkft-triage@...ts.linaro.org,
Maxime Ripard <mripard@...nel.org>, Dave Airlie <airlied@...hat.com>,
Dan Carpenter <dan.carpenter@...aro.org>
Subject: Re: arm: ERROR: modpost: "__aeabi_uldivmod" [drivers/gpu/drm/sun4i/sun4i-drm-hdmi.ko]
undefined!
On Mon, 4 Mar 2024 at 14:49, Arnd Bergmann <arnd@...db.de> wrote:
>
> On Mon, Mar 4, 2024, at 14:01, Ard Biesheuvel wrote:
> > On Mon, 4 Mar 2024 at 13:35, Arnd Bergmann <arnd@...db.de> wrote:
> >> On Mon, Mar 4, 2024, at 12:45, Andre Przywara wrote:
> >> It's not critical if this is called infrequently, and as Maxime
> >> just replied, the 64-bit division is in fact required here.
> >> Since we are dividing by a constant value (200), there is a good
> >> chance that this will get turned into fairly efficient
> >> multiply/shift code.
> >>
> >
> > Clang does not implement that optimization for 64-bit division. That
> > is how we ended up with this error in the first place.
>
> I meant it will use the optimization after the patch to convert
> the plain '/' to div_u64().
>
Ah ok.
I did not realize we implement the same optimization in our code as
the one that GCC will apply when encountering a compile-time constant
divisor.
> > Perhaps it is worthwhile to make div_u64() check its divisor, e.g.,
> >
> > --- a/include/linux/math64.h
> > +++ b/include/linux/math64.h
> > @@ -127,6 +127,9 @@
> >  static inline u64 div_u64(u64 dividend, u32 divisor)
> >  {
> >  	u32 remainder;
> > +
> > +	if (IS_ENABLED(CONFIG_CC_IS_GCC) && __builtin_constant_p(divisor))
> > +		return dividend / divisor;
> >  	return div_u64_rem(dividend, divisor, &remainder);
> >  }
>
> I think the div_u64()->do_div()->__div64_const32()->__arch_xprod_64()
> optimization in asm-generic/div64.h already produces what we want
> on both compilers. Is there something missing there?
>
No, you are right. I thought we were relying on GCC for the optimization here.
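
To illustrate the multiply/shift transformation discussed above, here is a minimal user-space sketch of dividing a 64-bit value by the constant 200 without a 64-bit divide instruction or a libgcc helper such as __aeabi_uldivmod. It is not the kernel's __div64_const32() code: the names mul_hi_u64() and div_by_200() and the choice of magic constant are assumptions made for this example. It relies on 200 = 8 * 25, so the dividend is first shifted right by 3 and then multiplied by ceil(2^68 / 25), keeping only the bits above 2^68.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 64 bits of the 128-bit product a * b, built from 32x32->64
 * multiplies only, so no 128-bit type or library helper is needed. */
static uint64_t mul_hi_u64(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

	uint64_t lo_lo = a_lo * b_lo;
	uint64_t hi_lo = a_hi * b_lo;
	uint64_t lo_hi = a_lo * b_hi;
	uint64_t hi_hi = a_hi * b_hi;

	/* Sum the pieces that overlap bits 32..63 to get the carry
	 * into the upper half. The sum cannot overflow 64 bits. */
	uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + (uint32_t)lo_hi;

	return hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (cross >> 32);
}

/* n / 200 computed as ((n >> 3) * ceil(2^68 / 25)) >> 68.
 * After the shift by 3 the dividend is below 2^61, which keeps the
 * rounding error of the reciprocal too small to change the quotient. */
static uint64_t div_by_200(uint64_t n)
{
	const uint64_t magic = 0xA3D70A3D70A3D70BULL; /* ceil(2^68 / 25) */

	return mul_hi_u64(n >> 3, magic) >> 4;
}

/* Compare against the plain '/' operator for a few edge cases. */
static void check_div_by_200(void)
{
	assert(div_by_200(0) == 0);
	assert(div_by_200(199) == 0);
	assert(div_by_200(200) == 1);
	assert(div_by_200(UINT64_MAX) == UINT64_MAX / 200);
}
```

Calling check_div_by_200() from a test program compares the result with the ordinary '/' operator. The in-kernel path handles arbitrary constant divisors and picks its reciprocal and shift at compile time, so treat this only as a picture of the idea.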