lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 24 Jan 2024 14:31:05 +0100
From: Jiri Wiesner <jwiesner@...e.de>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Stephen Rothwell <sfr@...b.auug.org.au>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Linux Next Mailing List <linux-next@...r.kernel.org>
Subject: Re: linux-next: build failure after merge of the rcu tree

On Wed, Jan 24, 2024 at 04:12:23AM -0800, Paul E. McKenney wrote:
> On Wed, Jan 24, 2024 at 10:49:54AM +0100, Jiri Wiesner wrote:
> > On Wed, Jan 24, 2024 at 03:17:43PM +1100, Stephen Rothwell wrote:
> > > After merging the rcu tree, today's linux-next build (i386 defconfig)
> > > failed like this:
> > > In file included from include/linux/dev_printk.h:14,
> > >                  from include/linux/device.h:15,
> > >                  from kernel/time/clocksource.c:10:
> > > kernel/time/clocksource.c: In function 'clocksource_watchdog':
> > > kernel/time/clocksource.c:103:34: error: integer overflow in expression of type 'long int' results in '-1619276800' [-Werror=overflow]
> > >   103 |                                  * NSEC_PER_SEC / HZ)
> > >       |                                  ^
> > > Caused by commit
> > >   1a4545025600 ("clocksource: Skip watchdog check for large watchdog intervals")
> > > I have used the rcu tree from next-20240123 for today.
> > 
> > This particular patch is still beging discussed on the LKML. This is the 
> > latest submission with improved variable naming, increased threshold and 
> > changes to the log and the warning message (as proposed by tglx):
> > https://lore.kernel.org/lkml/20240122172350.GA740@incl/
> > Especially the change to the message is important. I think this message 
> > will be commonplace on 8 NUMA node (and larger) machines. If there is 
> > anything else I can do to assist please let me know.
> 
> Here is the offending #define:
> 
> #define WATCHDOG_INTR_MAX_NS	((WATCHDOG_INTERVAL + (WATCHDOG_INTERVAL >> 1))\
> 				 * NSEC_PER_SEC / HZ)
> 
> The problem is that these things are int or long, and on i386, that
> is only 32 bits.  NSEC_PER_SEC is one billion, and WATCHDOG_INTERVAL
> is often 1000, which overflows.  The division by HZ gets this back in
> range at about 1.5x10^9.

Exactly.

> So this computation must be done in 64 bits even on 32-bit systems.
> My thought would be a cast to u64, then back to long for the result.

This will be a more precise solution than enclosing NSEC_PER_SEC / HZ in 
brackets, which I chose to do in the v2 of this patch.

> Whatever approach, Jiri, would you like to send an updated patch?

Yes, I can incorporate the casting to u64 and back to long into the patch. 
At this point, I am not sure which version to use. There are:
* v1 (submitted to the LKML on Jan 3rd): the patch that got merged into linux-next
* v2 (submitted to the LKML on Jan 10th): that has an alternative fix for the interger overflow
* v3 (submitted to the LKML on Jan 22nd): that incoporates suggestions by Thomas Gleixner

I could update the v3 of this patch with casting to u64 and back to long. 
WATCHDOG_INTERVAL_MAX_NS got set to 2 * WATCHDOG_INTERVAL in v3 - a change 
I do not entirely agree with. I think WATCHDOG_INTERVAL_MAX_NS should be 
kept narrow so as not to impose a limit on time skew that is too strict 
for readout intervals approaching 2 * WATCHDOG_INTERVAL in their length. 
The question is what is too strict.
-- 
Jiri Wiesner
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ