Message-ID: <6b5c4acc-f184-4ad9-9029-dd7967fe4a04@paulmck-laptop>
Date: Wed, 24 Jan 2024 04:12:23 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Jiri Wiesner <jwiesner@...e.de>
Cc: Stephen Rothwell <sfr@...b.auug.org.au>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux Next Mailing List <linux-next@...r.kernel.org>
Subject: Re: linux-next: build failure after merge of the rcu tree
On Wed, Jan 24, 2024 at 10:49:54AM +0100, Jiri Wiesner wrote:
> On Wed, Jan 24, 2024 at 03:17:43PM +1100, Stephen Rothwell wrote:
> > After merging the rcu tree, today's linux-next build (i386 defconfig)
> > failed like this:
> > In file included from include/linux/dev_printk.h:14,
> > from include/linux/device.h:15,
> > from kernel/time/clocksource.c:10:
> > kernel/time/clocksource.c: In function 'clocksource_watchdog':
> > kernel/time/clocksource.c:103:34: error: integer overflow in expression of type 'long int' results in '-1619276800' [-Werror=overflow]
> > 103 | * NSEC_PER_SEC / HZ)
> > | ^
> > Caused by commit
> > 1a4545025600 ("clocksource: Skip watchdog check for large watchdog intervals")
> > I have used the rcu tree from next-20240123 for today.
>
> This particular patch is still being discussed on the LKML. This is the
> latest submission with improved variable naming, increased threshold and
> changes to the log and the warning message (as proposed by tglx):
> https://lore.kernel.org/lkml/20240122172350.GA740@incl/
> Especially the change to the message is important. I think this message
> will be commonplace on 8 NUMA node (and larger) machines. If there is
> anything else I can do to assist please let me know.
Here is the offending #define:
#define WATCHDOG_INTR_MAX_NS ((WATCHDOG_INTERVAL + (WATCHDOG_INTERVAL >> 1))\
* NSEC_PER_SEC / HZ)
The problem is that these things are int or long, and on i386, that
is only 32 bits. NSEC_PER_SEC is one billion, and WATCHDOG_INTERVAL
is often 1000, which overflows. The division by HZ gets this back in
range at about 1.5x10^9.
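The wraparound can be checked with a small userspace sketch (not kernel code). It forms the exact product in 64 bits, then keeps only the low 32 bits the way a 32-bit signed multiply would. The helper name and the `DEMO_` constant are hypothetical. The interval value of 500 is an assumption, chosen only because it reproduces the -1619276800 in the compiler error quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* One billion, as stated above; hypothetical name to avoid clashing with kernel macros. */
#define DEMO_NSEC_PER_SEC_U64 UINT64_C(1000000000)

/*
 * Emulate the 32-bit signed result of
 *   (interval + (interval >> 1)) * NSEC_PER_SEC
 * by computing the exact 64-bit product, keeping its low 32 bits,
 * and reinterpreting them as a two's-complement value.
 */
static int64_t demo_wrapped_product32(int64_t interval)
{
	assert(interval >= 0);

	uint64_t exact = (uint64_t)(interval + (interval >> 1)) * DEMO_NSEC_PER_SEC_U64;
	uint32_t low = (uint32_t)exact;	/* what survives in a 32-bit register */

	return low >= UINT32_C(0x80000000)
		? (int64_t)low - INT64_C(0x100000000)
		: (int64_t)low;
}
```

Under that assumption, `demo_wrapped_product32(500)` gives -1619276800. With an interval of 1000 the truncated product is 1056413696, which is positive but still wrong, since the exact product is 1.5x10^12.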
So this computation must be done in 64 bits even on 32-bit systems.
My thought would be a cast to u64, then back to long for the result.
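A minimal userspace sketch of that idea, assuming `uint64_t` as a stand-in for the kernel's `u64`, might look like the following. The function name, the `DEMO_` constant, and passing the interval and HZ as parameters are all hypothetical choices for illustration. This is not the patch that was eventually submitted.

```c
#include <assert.h>
#include <stdint.h>

/* One billion; hypothetical name standing in for the kernel's NSEC_PER_SEC. */
#define DEMO_NSEC_PER_SEC 1000000000L

/*
 * Compute 1.5 * interval * NSEC_PER_SEC / hz with a 64-bit intermediate,
 * then narrow the (now in-range) result back to long.
 */
static long demo_intr_max_ns(long interval, long hz)
{
	assert(interval >= 0 && hz > 0);

	/* The cast applies before the multiply, so the product is formed in 64 bits. */
	uint64_t ns = (uint64_t)(interval + (interval >> 1)) * DEMO_NSEC_PER_SEC / hz;

	return (long)ns;
}
```

With an interval of 1000 and HZ of 1000 this returns 1500000000, matching the 1.5x10^9 figure above, and that value still fits in a 32-bit signed long. The cast has to sit on the first operand, because casting the finished product would widen a value that has already overflowed.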
Whatever approach, Jiri, would you like to send an updated patch?
In the meantime, I will rebase to exclude this one from -next.
Thanx, Paul