Message-ID: <464439C8.7090309@gmail.com>
Date:	Fri, 11 May 2007 11:39:20 +0200
From:	Tejun Heo <htejun@...il.com>
To:	Paul Mundt <lethal@...ux-sh.org>, Tejun Heo <htejun@...il.com>,
	jeff@...zik.org, linux-ide@...r.kernel.org,
	linux-kernel@...r.kernel.org
CC:	garyhade@...ibm.com
Subject: Re: libata reset-seq merge broke sata_sil on sh

[cc'ing Gary Hade regarding the GoVault drive.  Hello, Gary.]

Hello, Paul.

Paul Mundt wrote:
> On Thu, May 10, 2007 at 03:08:59PM +0200, Tejun Heo wrote:
>> Paul Mundt wrote:
>>> The detection is simply flaky after that point, however before the
>>> current master it never hit the 35 second point (and thus never implied
>>> that the link was down). I'll double check the bisect log to see if there
>>> was anything beyond that that may have caused it.
>>>
>>> The -ENODEV at least implies that the SRST fails, so at least that's a
>>> starting point.
>> If prereset() fails to get the initial DRDY before 10secs, it assumes
>> something went wrong and escalates to hardreset.  sil family of
>> controllers report 0xff status while the link is broken and it seems
>> that your particular drive needs more than the current 150ms to recover
>> phy link.  It probably went unnoticed till now because the device was
>> never hardreset before.  If the diagnosis is correct, increasing the
>> delay in hardreset should fix the problem.  Well, let's see.  :-)
>>
> Bumping the hardreset delay up does indeed fix it, I've had to bump it up
> to 1200 before it started working (at 600 it still fails):
> 
> [    0.967379] scsi0 : sata_sil
> [    0.970425] scsi1 : sata_sil
> [    0.973298] ata1: SATA max UDMA/100 cmd 0xfd000280 ctl 0xfd00028a bmdma 0xfd000200 irq 0
> [    0.981331] ata2: SATA max UDMA/100 cmd 0xfd0002c0 ctl 0xfd0002ca bmdma 0xfd000208 irq 0
> [    1.299353] ata1: device not ready (errno=-19), forcing hardreset
> [    2.817893] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> [    2.826284] ata1.00: ata_hpa_resize 1: sectors = 39070080, hpa_sectors = 39070080
> [    2.831052] ata1.00: ATA-5: HHD424020F7SV00, 00MLA0A5, max UDMA/100
> [    2.837548] ata1.00: 39070080 sectors, multi 0: LBA
> [    2.842702] ata1.00: applying bridge limits
> [    2.854162] ata1.00: ata_hpa_resize 1: sectors = 39070080, hpa_sectors = 39070080
> [    2.858938] ata1.00: configured for UDMA/100
> [    3.172602] ata2: SATA link down (SStatus 0 SControl 310)
> [    3.175736] scsi 0:0:0:0: Direct-Access     ATA      HHD424020F7SV00  00ML PQ: 0 ANSI: 5
> 
> I'm not sure if it matters or not, but this is an iVDR drive, so that
> might also have additional implications.

I don't have the faintest idea what an iVDR drive is, but I take it that
it's some sort of special-purpose thing.  :-)

So, the HHD424020F7SV00 is the second device which isn't happy with the
150ms wait after reset.  The first one was the Quantum GoVault.  This
case is a bit different in that the HHD only fails after hardreset.

Anyways, so, it seems we'll have to extend the 150ms wait.  I'll try to
cook up a patch for that && parallel probing.
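
To illustrate the idea (this is not the planned patch and not libata
code): rather than sleeping for a fixed 150ms, one could poll the status
register until it stops reading 0xff or a longer deadline passes.  A
rough sketch, where read_status_fn, sleep_ms_fn, wait_link_status and the
fake_* helpers are all hypothetical names made up for this example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callbacks, not libata API. */
typedef uint8_t (*read_status_fn)(void *ctx);
typedef void (*sleep_ms_fn)(unsigned int ms);

/*
 * Poll the status register until it reads something other than 0xff
 * (which sil controllers report while the link is broken) or until
 * timeout_ms has elapsed.  Returns true if the status became valid.
 */
bool wait_link_status(read_status_fn read_status, void *ctx,
                      sleep_ms_fn sleep_ms,
                      unsigned int timeout_ms, unsigned int poll_ms)
{
    unsigned int waited = 0;

    assert(read_status != NULL && sleep_ms != NULL);
    if (poll_ms == 0)
        poll_ms = 1;            /* avoid spinning without progress */

    for (;;) {
        if (read_status(ctx) != 0xff)
            return true;
        if (waited >= timeout_ms)
            return false;
        sleep_ms(poll_ms);
        waited += poll_ms;
    }
}

/* Simulated device for demonstration: reads 0xff for the first
 * *ctx polls, then 0x50 (DRDY | DSC). */
uint8_t fake_status(void *ctx)
{
    unsigned int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return 0xff;
    }
    return 0x50;
}

/* Simulated sleep: does nothing, so the example runs instantly. */
void fake_sleep(unsigned int ms)
{
    (void)ms;
}
```

With the simulated device, a drive that needs three polls to come back is
accepted under a 1200ms deadline, while one that never recovers is given
up on once the deadline is reached.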

Gary, IIRC, the requirement for GoVault was 3secs, right?  Paul, can you
try to estimate the minimum required delay?  Please go down from 1200ms
in 100ms steps and report where it breaks.
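
In case it helps, the step-down procedure amounts to something like the
sketch below.  probe_fn stands in for "rebuild with this delay, boot, and
see whether the drive is detected"; it and the other names are
hypothetical, and the 700ms threshold in example_probe is a made-up
placeholder, not a measurement:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe: true if the drive was detected when the
 * post-hardreset wait was delay_ms milliseconds. */
typedef bool (*probe_fn)(unsigned int delay_ms);

/*
 * Step down from start_ms in step_ms decrements and return the
 * smallest delay that still worked, or 0 if even start_ms failed.
 */
unsigned int min_working_delay(probe_fn probe, unsigned int start_ms,
                               unsigned int step_ms)
{
    unsigned int last_good = 0;
    unsigned int d = start_ms;

    assert(probe != NULL && step_ms > 0);

    while (d > 0) {
        if (!probe(d))
            break;              /* first failure: stop here */
        last_good = d;
        if (d <= step_ms)
            break;              /* cannot step down any further */
        d -= step_ms;
    }
    return last_good;
}

/* Stand-in probe with an invented threshold, for illustration only. */
bool example_probe(unsigned int delay_ms)
{
    return delay_ms >= 700;
}
```

Calling min_working_delay(example_probe, 1200, 100) walks 1200, 1100, ...
and reports the last delay that passed before the first failure.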

Thanks.

-- 
tejun