Message-ID: <20120514104829.GA25923@gmail.com>
Date:	Mon, 14 May 2012 12:48:29 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Robin Holt <holt@....com>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	linux-kernel@...r.kernel.org
Subject: Re: Commit cb83b62 fails to boot with a divide by zero error.


* Robin Holt <holt@....com> wrote:

> On Fri, May 11, 2012 at 05:36:13PM +0200, Peter Zijlstra wrote:
> > On Fri, 2012-05-11 at 10:05 -0500, Robin Holt wrote:
> > > On Fri, May 11, 2012 at 04:33:10PM +0200, Peter Zijlstra wrote:
> > > > On Fri, 2012-05-11 at 08:39 -0500, Robin Holt wrote:
> > > > 
> > > > > We found that reverting the commit:
> > > > > cb83b62 (x86/sched/core) sched/numa: Rewrite the CONFIG_NUMA sched domain support
> > > > > 
> > > > > also got things working.
> > > > 
> > > > there's a particularly stupid bug in that code
> > > 
> > > Even with that applied, I still get the divide by zero.
> > 
> > Humm.. what kind of machine is this? And how far along does it get in
> > booting? ->power isn't supposed to get to 0.
> 
> It is a four blade (8 socket 80 core 160 hyper-thread machine) 
> with 40 GB of RAM.
> 
> Looking at the earlier kernel messages, I am wondering if I 
> don't have a BIOS that is giving me crud.  I have messages 
> about hyperthreads being on different nodes.  That had not 
> been happening in the past.  I don't have access to the 
> machine now, but the BIOS string that had printed out is from 
> a developer's debug version.
> 
> When I get access to the machine again (likely not until 
> Monday), I will flash a release BIOS and retest.  Until then, 
> please feel free to ignore me.

Please don't re-flash the BIOS! We want to fix this bug - the 
kernel should never crash on whatever topology data the BIOS 
passes.

We can sanitize it or ignore it, but crashing is not an option. 
So let's figure this out, ok?

Thanks,

	Ingo
