Message-ID: <46538AEE.4030700@linux-foundation.org>
Date:	Tue, 22 May 2007 17:29:34 -0700
From:	Stephen Hemminger <shemminger@...ux-foundation.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
CC:	Mike Houston <mikeserv@...s.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.22-rc2

Linus Torvalds wrote:
> On Tue, 22 May 2007, Mike Houston wrote:
>   
>> In this case I actually had the kernel crash. First time for me ever
>> having a kernel oops! System locked up with keyboard LED's blinking.
>>
>> Not sure if anyone wants to see all of it (maybe some screwy
>> userland stuff involved), so I won't include that mess in the
>> message. It's here:
>> http://www.mikeserv.org/files/kernelcrash.txt
>>     
>
> I think you have major memory corruption. That first oops disassembles to
>
> 		mov    0x10(%eax),%esi
> 		mov    $0xfffffdfd,%eax
> 		test   %esi,%esi
> 		je     after_call
> 		mov    %edx,%ecx
> 		mov    %edi,%eax
> 		mov    %ebx,%edx
> 		call   *%esi
> 	after_call:
>
> which is (from net/ipv4/af_inet.c, inet_ioctl()):
>
>                 default:
>                         if (sk->sk_prot->ioctl)
>                                 err = sk->sk_prot->ioctl(sk, cmd, arg);
>                         else
>                                 err = -ENOIOCTLCMD;
>                         break;
>
> and the load off "sk->sk_prot->ioctl" oopses, because "sk->sk_prot" is 
> corrupt and contains 0x8e3cad42, which is not a valid kernel pointer.
>
> The other oops is even worse. 
>
> I also think it meshes with
>
> 	sky2 eth0: descriptor error q=0x280 get=285 [800042375e2e5e] put=285
>
>   
A descriptor error means the driver told the chip to do something but the
OWNER bit wasn't set on the descriptor. I have only ever seen this on the
Gigabyte motherboard.
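
To make the OWNER bit handshake concrete, here is a minimal user-space
sketch of the idea. It is not the sky2 code: the struct layout, the
DESC_HW_OWNER value and both function names are made up for illustration.
The driver sets the flag when it hands a descriptor to the hardware; if
the hardware fetches a descriptor without the flag, that is the
"descriptor error" case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ownership flag: set means the hardware owns the descriptor. */
#define DESC_HW_OWNER 0x80u

/* Hypothetical ring descriptor, loosely modelled on a typical NIC layout. */
struct ring_desc {
	uint32_t addr;    /* bus address of the buffer */
	uint16_t length;  /* buffer length in bytes */
	uint8_t  ctrl;    /* control flags (unused in this sketch) */
	uint8_t  opcode;  /* operation code, ORed with the ownership flag */
};

/* Driver side: fill in the descriptor, then hand it to the hardware.
 * In a real driver a write barrier would be needed before the last store
 * so the device never sees the flag ahead of the other fields. */
static void desc_give_to_hw(struct ring_desc *d, uint32_t addr,
			    uint16_t len, uint8_t op)
{
	d->addr = addr;
	d->length = len;
	d->ctrl = 0;
	d->opcode = op | DESC_HW_OWNER;
}

/* Device side (modelled in software): a descriptor may be processed only
 * if the ownership flag is set. A false result corresponds to the
 * descriptor error case. */
static bool desc_hw_may_process(const struct ring_desc *d)
{
	return (d->opcode & DESC_HW_OWNER) != 0;
}
```

If the chip reads the wrong memory, it can see a descriptor without the
flag even though the driver wrote it correctly.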

It looks like the chip sometimes reads the wrong memory. The problem
happens only with the on-board NICs, and only on this kind of motherboard.
For testing, I put code in to check that the receive data had actually
arrived before the IRQ, and the check triggered on my Gigabyte 925
motherboard. It appears that DMA access is messed up. This board has lots
of "overclocker" friendly stuff; maybe the BIOS never really sets up the
PCI bridges and clocks properly.
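
One way such a "did the data arrive before the IRQ" check could be done is
sketched below. This is an assumption about the general technique, not the
test code mentioned above: fill the receive buffer with a known pattern
before giving it to the hardware, and when the interrupt reports a frame,
see whether the pattern is still there. RX_POISON and the function names
are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fill pattern written before the buffer goes to the NIC. */
#define RX_POISON 0xA5

/* Fill the buffer with the pattern before posting it for receive. */
static void rx_buf_poison(uint8_t *buf, size_t len)
{
	memset(buf, RX_POISON, len);
}

/* Called when the hardware claims `len` bytes were received.
 * Returns true if every one of those bytes is still the fill pattern,
 * i.e. the DMA write does not appear to have landed yet.
 * A frame made up entirely of RX_POISON bytes would be a false positive. */
static bool rx_buf_looks_stale(const uint8_t *buf, size_t len)
{
	if (len == 0)
		return false;
	for (size_t i = 0; i < len; i++) {
		if (buf[i] != RX_POISON)
			return false;
	}
	return true;
}
```

In a real driver the buffer would also have to be synced for CPU access
before looking at it, which this sketch leaves out.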

It doesn't seem like a software or driver problem. I have tried tweaking
PCI registers, but nothing worked in this case.
