lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <45DBEED0.6050703@gmail.com>
Date:	Wed, 21 Feb 2007 16:03:44 +0900
From:	Tejun Heo <htejun@...il.com>
To:	Jeff Garzik <jeff@...zik.org>
CC:	Mark Lord <lkml@....ca>, auxsvr@...il.com,
	linux-kernel@...r.kernel.org
Subject: Re: ata command timeout

Jeff Garzik wrote:
> Mark Lord wrote:
>> I don't believe that.  Command timeouts never happen on healthy systems,
>> unless we have a driver bug.  Okay, so I can imagine a pathological case
>> of a full queue (NCQ) with all 32 commands taking longer than usual due
>> to ECC retries in the firmware..
> 
> It's not quite so black and white.  There have definitely been interrupt
> delivery problems that cause command timeouts.  Also, Intel PIIX BMDMA
> (all standard PCI IDE, I think?) is defined to /not/ send an interrupt,
> when a DMA error occurs.  The driver is instructed to time out the
> transaction, and start recovery by deducing the state of things from the
> DMA status bits.
> 
> Nonetheless, I mostly agree with your statement.  The two most common
> causes of timeouts that I see are interrupt delivery problems, and
> driver bugs.

Oh.. well.  My experience is that it's much more common on SATA compared
to PATA.  SATA link seems to be one of the most vulnerable parts to
interference.  When PSU has the slightest of problem, SATA drives
timeout or give transmission problems.  System often survives brief
fluctuation in power input (e.g. when the compressor starts up) but SATA
link sometimes reports error after such event.

Or just buy a static generator and apply it to your computer case.
Generally system is perfectly okay with that but the SATA devices tend
to complain or timeout.

Those condition might not be considered too healthy in any server
environment but they do occur on cheap desktop environment.  I mean, a
lot of people are putting 10USD PSU into their desktop machines.

So, yeah, it might be a driver or other problem but if problem is very
intermittent, I tend to lean toward transient hardware problem and
that's primarily why I wanna make EH kick in and recover faster.

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ