Message-ID: <20080411103428.GA15481@wotan.suse.de>
Date:	Fri, 11 Apr 2008 12:34:28 +0200
From:	Nick Piggin <npiggin@...e.de>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Pekka Enberg <penberg@...helsinki.fi>,
	linux-kernel@...r.kernel.org, Christoph Lameter <clameter@....com>,
	Mel Gorman <mel@....ul.ie>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>, Yinghai.Lu@....com
Subject: Re: [bug] mm/slab.c boot crash in -git, "kernel BUG at mm/slab.c:2103!"

On Fri, Apr 11, 2008 at 11:24:52AM +0200, Ingo Molnar wrote:
> 
> * Pekka Enberg <penberg@...helsinki.fi> wrote:
> 
> > On Fri, Apr 11, 2008 at 12:05 PM, Pekka Enberg <penberg@...helsinki.fi> wrote:
> > >  >  Right. Then you probably want to look into any changes in arch/x86/
> > >  >  related to setting up the zonelists. I'm fairly certain this is not a
> > >  >  slab bug and I don't see any recent changes to the page allocator
> > >  >  either that would explain this.
> > >
> > >  I'd be willing to put some money on this:
> > >
> > >  http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b7ad149d62ffffaccb9f565dfe7e5bae739d6836
> > 
> > And I'd lose as you're 32-bit. Oh well, that's the price to pay for 
> > pretending to know x86 arch internals.
> 
> yeah, sorry - we are working hard to unify generic bits like that, but 
> it's a huge architecture.

BTW, I think I'm seeing some problems, perhaps related to the change
page attr stuff for DEBUG_PAGEALLOC on x86-64. I'm also seeing some
general instability around either the page allocator or the slab
allocator, though I don't know whether it is the same thing.

The debug pagealloc problem seems to be that a thread suddenly gets stuck
in the kernel, spinning in cpa (usually on one of the locks), and never
seems to recover. Once it seemed to be spinning in clear_page_... too,
but perhaps it is messing up the page attributes and running so slowly
that it just appears to hang? I'll try to get more info here, but it is
hard to reproduce.

As for the general instability: I've seen an oops or two in the page
allocation path in slub recently. Nothing reportable, because I've been
running my own patches and/or been unable to reproduce it, but it is a
bit unusual and I'll keep an eye out.

Anyway, I'd suggest cooking this kernel a bit longer before release...
