Message-Id: <200610121753.23220.dj@david-web.co.uk>
Date:	Thu, 12 Oct 2006 17:53:22 +0100
From:	David Johnson <dj@...id-web.co.uk>
To:	Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Hardware bug or kernel bug?
Hi,
I'm having a major problem on a system, and I've been unable to track down
the cause. When using scp to transfer a large file (a few GB) over the
network (at 100Mbit/s), the system reboots after about 5-10 minutes of
transfer. No errors, just a reboot. I have another identical system which
exhibits the same behaviour.
The system is a Supermicro P4SCT+ with a hyperthreading P4. I've posted the 
dmesg here:
http://www.david-web.co.uk/download/dmesg
I initially tried a different NIC in case that was at fault, but the results 
were the same.
Changing the interrupt timer frequency in the kernel makes a difference:
100Hz - system reboots instantly when transfer is started
250Hz - reboots after a few seconds
1000Hz - reboots after 5-10 minutes
As the problem appears to be interrupt-related, I disabled the I/O APIC in the
BIOS (after first having to disable hyperthreading), which resulted in the
system lasting a bit longer before rebooting. I then tried disabling the
Local APIC as well, but this made no difference.
I've tested with CentOS's 2.6.9 kernel and with a vanilla 2.6.17.13 kernel, and
the results are the same with both.
Does anyone have any idea whether this is likely to be a hardware problem or a 
kernel problem?
Any suggestions for more ways to debug this would be gratefully received.
Thanks,
David.