Message-ID: <BANLkTinCKC23WfG7jnd-ihW9kXz-6SEgFQ@mail.gmail.com>
Date:	Tue, 3 May 2011 08:22:49 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	werner <w.landgraf@...ru>, Ingo Molnar <mingo@...e.hu>,
	"H. Peter Anvin" <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>
Cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: 2.6.39-rc5-git2 boot crashs

2011/5/3 werner <w.landgraf@...ru>:
> Please see the enclosed config.
>
> IDE on, X86_EXTENDED_PLATFORM off (also X86_ELAN)
>
> From the previous two suggestions: MTD is on (apparently it doesn't cause
> problems), but for MISC-FILESYSTEMS, which apparently causes the error
> message during boot and perhaps also the sync not working, I switched half
> on and the other half off, to narrow down the problem.

Ok, can you try the attached patch, to see if the logfs oops goes
away? Perhaps more importantly, does the sync problem also go away?

> No problem with unzip / zip / moving big files etc., so this problem
> comes from X86_EXTENDED_PLATFORM.

Ok, that is very interesting.

> Tell me what to try out now

So at this point you have two problems, and I really would like to
just doubly verify both of them. First off, the attached patch for the
logfs oops and (hopefully) the sync hanging issue.

But secondly, I want you to double-check that whole CONFIG_X86_ELAN
thing - I'd like you to test two kernels that are otherwise totally
identical in their configurations, except one has
CONFIG_X86_EXTENDED_PLATFORM and CONFIG_X86_ELAN on, and the other
does not.  Just to make sure that with all the changes to the config
file, that is really the _only_ difference, and that yes, that's the
one that brings up the "crash at unzip" problem.
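
One way to confirm that two configs really differ only in those two
options is to compare them mechanically. The following is a minimal
sketch, not part of the original message: the helper names
parse_config and diff_configs are hypothetical, the sample fragments
are made up for illustration, and it assumes the usual .config line
forms ("CONFIG_FOO=y" and "# CONFIG_FOO is not set"), treating an
option missing from a file as unset.

```python
import re

# The two line forms found in a kernel .config file.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each option name to its value; unset options map to 'n'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def diff_configs(a_text, b_text):
    """Return {option: (value_in_a, value_in_b)} for options that differ.

    An option absent from one file is treated as 'n' (assumption).
    """
    a, b = parse_config(a_text), parse_config(b_text)
    return {
        name: (a.get(name, "n"), b.get(name, "n"))
        for name in sorted(set(a) | set(b))
        if a.get(name, "n") != b.get(name, "n")
    }


# Made-up fragments standing in for the two test configs.
good = """\
CONFIG_SLUB=y
# CONFIG_X86_EXTENDED_PLATFORM is not set
"""
bad = """\
CONFIG_SLUB=y
CONFIG_X86_EXTENDED_PLATFORM=y
CONFIG_X86_ELAN=y
"""

for name, (old, new) in diff_configs(good, bad).items():
    print(f"{name}: {old} -> {new}")
```

With real files, the two strings would be read from the .config of
each build tree. If the output lists anything beyond the two options
in question, the kernels are not otherwise identical.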

I'm adding Ingo Molnar, Thomas Gleixner and Peter Anvin to the cc, because
if this whole problem really is because of the x86 CPU configuration,
they may have better ideas than I do.

Ingo/Thomas/Peter: see the whole long and confused thread on lkml. But
it all boils down to Werner using a very full kernel config where not
only is almost everything compiled in (which showed the logfs problem
even though Werner didn't even have a logfs filesystem), but he also
had a very generic x86 kernel. Too generic.

He had CONFIG_X86_EXTENDED_PLATFORM and CONFIG_X86_ELAN on, and that
has apparently worked for him (and a lot of other people - he does a
distribution) up until 2.6.38. But as of 2.6.39-rc1 it causes some
really odd problems under IO (his test-case is "unzip", but that's
probably fairly random). The problem seems to show up as a bogus IO
list for SATA, causing a big WARN_ON() or oops and then a dead machine
due to IO problems.

I wonder what CONFIG_X86_ELAN has to do with anything, but from all
the config testing werner has done, it really looks like that's the
smoking gun here.

Why does M686 work, but X86_ELAN causes odd problems in 2.6.39-rc?
Allocator issues? Maybe related to the lockless slub paths?

So I obviously agree that X86_ELAN is a crazy choice for a generic
kernel, but it _used_ to work, and this is a regression.

                       Linus

View attachment "patch.diff" of type "text/x-patch" (888 bytes)