Date:	Fri, 29 Apr 2011 21:00:24 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	werner <w.landgraf@...ru>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: 2.6.39-rc5-git2 boot crashs

On Fri, Apr 29, 2011 at 8:39 PM, werner <w.landgraf@...ru> wrote:
>
> In my thread reporting the 2.6.39-rc3 and -rc4 crashes, I mentioned that
> the system is left in a changed state after a crash, and that this state
> survives a reset. On subsequent boots (after a 'primary' crash near the end
> of booting), an early 'secondary' crash happens while ata0 is being
> initialized, with odd effects such as the graphics card (or some other
> device) being identified as an ata device, followed by 'read errors' on it
> and a crash. This 'secondary' effect repeats again and again and goes away
> only after booting a normal kernel (2.6.38.4 or 2.6.26.2). But if I then
> boot 2.6.39-rc3 or -rc4 again, it crashes at the end of the boot, and on
> subsequent boots the reset-resistant effect returns: it crashes again and
> again with ata0 problems until I reboot with 2.6.38.4 or 2.6.26.2, or wait
> 5 minutes (perhaps until the memory has discharged).
>
> None of these problems happen with 2.6.38.4 or 2.6.26.2.

Do you think you could bisect when that odd after-reset behavior started?

It does sound like you have some PCI-level problem (some device that
has "sticky" state and doesn't get reset properly). Most likely a
hardware "feature" (there is various PCI hardware that allows things
like device identifiers to be written to), coupled with a firmware bug
that doesn't reset things.

But it would be intriguing to hear when it started happening, so that
we can figure out exactly _what_ isn't getting properly reset..

The logfs oops may just be a result of "autodetect any random
filesystem" in that confused state. So when the state isn't confused,
you'd not see the oops, because nothing ever tries to mount the
invalid logfs image.

                 Linus
