Date:	Mon, 19 Dec 2011 09:02:07 +0100
From:	Urban Loesch <bind@...s.net>
To:	Shawn Bohrer <shawn.bohrer@...il.com>
CC:	linux-kernel@...r.kernel.org
Subject: Re: divide by zero error: find busiest group on kernel 2.6.38.4

Hi,

On 17.12.2011 00:14, Shawn Bohrer wrote:
> On Sun, Dec 04, 2011 at 03:00:00PM +0100, Urban Loesch wrote:
>> I'm running a DELL PE R610 with kernel
>> 2.6.38.4 patched with linux vserver version vs2.3.0.37-rc15 from
>> http://linux-vserver.org.
>>
>> The server ran fine for about 220 days without any problems.
>> But last night there was a kernel panic and the server hung completely.
>>
>> Thanks to netconsole I got the following error on my syslog server:
>>
>>
>> 2011-12-04 00:32:16 divide error: 0000 [#1]
>> 2011-12-04 00:32:16 SMP
> <snip>
>> 2011-12-04 00:32:16 Pid: 0, comm: kworker/0:1 Not tainted
>> 2.6.38.4-vs2.3.0.37-rc15-rol-em64t #1
>> 2011-12-04 00:32:16
>> 2011-12-04 00:32:16 Dell Inc. PowerEdge R610
>> 2011-12-04 00:32:16 /
>> 2011-12-04 00:32:16 0F0XJ6
>> 2011-12-04 00:32:16
>> 2011-12-04 00:32:16 RIP: 0010:[<ffffffff8103abb8>]
>> 2011-12-04 00:32:16 [<ffffffff8103abb8>]
>> find_busiest_group+0x428/0xdd0
>
> This looks like the same issue as:
>
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=636797
> and
> https://bugs.launchpad.net/linux/+bug/614853
>
> In theory there is also a bugzilla.kernel.org ticket on this issue,
> though bugzilla.kernel.org is still down.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=16991
>
> Debian and Ubuntu have papered over this bug by skipping the divide if
> cpu_power is 0.
>
>> I searched the archives but I didn't find any related information.
>> Do you have any idea what this error could be, and is it fixed in kernel 3.1?
>
> To my knowledge the cause of this bug is still unknown.  It is
> possible it is fixed in newer kernels, but it is hard to tell since it
> doesn't seem to occur until you have reached 200+ days of uptime.
>
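
The workaround described above (skipping the divide when cpu_power is 0)
can be sketched roughly as follows. This is only an illustration, not the
actual Debian or Ubuntu patch: the helper name scaled_avg_load, the
LOAD_SCALE constant, and the choice to return 0 when the divide is skipped
are all assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical stand-in for the scheduler's load scale constant. */
#define LOAD_SCALE 1024u

/*
 * Hypothetical helper: scale a group's load by its cpu_power.
 * If cpu_power is 0 the division is skipped instead of trapping
 * with a divide error; returning 0 here is an assumption.
 */
static uint64_t scaled_avg_load(uint64_t group_load, uint32_t cpu_power)
{
    if (cpu_power == 0)
        return 0;
    return (group_load * LOAD_SCALE) / cpu_power;
}
```

A guard like this only hides the symptom. It does not explain why
cpu_power became 0 in the first place.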

I am not sure whether this describes exactly the same problem:

http://comments.gmane.org/gmane.linux.kernel/1132515

Patch:
http://git.kernel.org/?p=linux/kernel/git/tip/tip.git;a=commitdiff;h=4cecf6d401a01d054afc1e5f605bcbfe553cb9b9

This issue was fixed in 3.1.5.
http://www.kernel.org/pub/linux/kernel/v3.0/ChangeLog-3.1.5
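
As far as I understand the linked patch, it avoids a 64-bit overflow when
TSC cycles are converted to nanoseconds in sched_clock. Whether that is
the cause of the crash reported here is not confirmed. The sketch below
only illustrates the arithmetic: the function names and the scale shift
of 10 are assumptions, not the kernel code. With a shift of 10, the
intermediate product wraps once the result reaches 2^54 ns, which is
about 208 days and would fit the 200+ days of uptime.

```c
#include <stdint.h>

/* Assumed fixed-point shift: mult = (ns per cycle) * 2^10. */
#define SCALE_SHIFT 10

/* Naive conversion: cyc * mult can wrap around 64 bits before the shift. */
static uint64_t cycles_to_ns_naive(uint64_t cyc, uint64_t mult)
{
    return (cyc * mult) >> SCALE_SHIFT;
}

/*
 * Split conversion: handle the quotient and remainder of cyc / 2^10
 * separately, so the intermediate products stay small for far longer.
 */
static uint64_t cycles_to_ns_split(uint64_t cyc, uint64_t mult)
{
    uint64_t quot = cyc >> SCALE_SHIFT;
    uint64_t rem = cyc & ((1ULL << SCALE_SHIFT) - 1);

    return quot * mult + ((rem * mult) >> SCALE_SHIFT);
}
```

For a 1 GHz TSC, mult would be 1024, so one cycle equals one nanosecond.
At 2^54 cycles the naive version wraps to 0, while the split version
still returns 2^54.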

> --
> Shawn
>

Thanks
Urban
