Date:	Fri, 16 Dec 2011 17:14:30 -0600
From:	Shawn Bohrer <shawn.bohrer@...il.com>
To:	Urban Loesch <bind@...s.net>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: divide by zero error: find busiest group on kernel 2.6.38.4

On Sun, Dec 04, 2011 at 03:00:00PM +0100, Urban Loesch wrote:
> I'm running a DELL PE R610 with kernel
> 2.6.38.4 patched with linux vserver version vs2.3.0.37-rc15 from
> http://linux-vserver.org.
> 
> The server runs fine about 220 days without any problems.
> But last night there was a kernel panic and the server totally hangs.
> 
> Thanks to netconsole I got the following error in my syslogserver:
> 
> 
> 2011-12-04 00:32:16 divide error: 0000 [#1]
> 2011-12-04 00:32:16 SMP
<snip>
> 2011-12-04 00:32:16 Pid: 0, comm: kworker/0:1 Not tainted
> 2.6.38.4-vs2.3.0.37-rc15-rol-em64t #1
> 2011-12-04 00:32:16
> 2011-12-04 00:32:16 Dell Inc. PowerEdge R610
> 2011-12-04 00:32:16 /
> 2011-12-04 00:32:16 0F0XJ6
> 2011-12-04 00:32:16
> 2011-12-04 00:32:16 RIP: 0010:[<ffffffff8103abb8>]
> 2011-12-04 00:32:16 [<ffffffff8103abb8>]
> find_busiest_group+0x428/0xdd0

This looks like the same issue as:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=636797
and
https://bugs.launchpad.net/linux/+bug/614853

In theory there is also a bugzilla.kernel.org ticket for this issue,
though bugzilla.kernel.org is still down:

https://bugzilla.kernel.org/show_bug.cgi?id=16991

Debian and Ubuntu have papered over this bug by skipping the divide if
cpu_power is 0.
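
For illustration, the kind of guard described above could look roughly
like the sketch below. This is not the actual Debian or Ubuntu patch;
the helper name guarded_avg_load() and the LOAD_SCALE value are
assumptions made up for this example, and only cpu_power comes from the
discussion above.

```c
/*
 * Assumed fixed-point scale, chosen for illustration only; a
 * hypothetical stand-in for whatever scale the scheduler uses.
 */
#define LOAD_SCALE 1024UL

/*
 * Hypothetical helper: scaled average load of a scheduling group.
 * If cpu_power is 0 the divide is skipped and 0 is returned, which
 * avoids the divide error but does not explain why cpu_power was 0.
 */
unsigned long guarded_avg_load(unsigned long group_load,
                               unsigned long cpu_power)
{
	if (cpu_power == 0)
		return 0;	/* skip the divide instead of trapping */

	return (group_load * LOAD_SCALE) / cpu_power;
}
```

A guard like this only hides the symptom: the balancer gets a
placeholder value for that group, and whatever left cpu_power at 0 is
still there.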

> I searched the archives but I didn't find any related information.
> Have you any idea what this error could be and is it fixed in kernel 3.1?

To my knowledge the cause of this bug is still unknown.  It is
possible it is fixed in newer kernels, but it is hard to tell since it
doesn't seem to occur until you have reached 200+ days of uptime.

--
Shawn