Date:	Mon, 25 Jun 2007 11:12:37 +0900
From:	Tejun Heo <htejun@...il.com>
To:	Robert Hancock <hancockr@...w.ca>
CC:	Andrew Morton <akpm@...ux-foundation.org>, enricoss@...cali.it,
	linux-kernel@...r.kernel.org, linux-ide@...r.kernel.org,
	Jeff Garzik <jeff@...zik.org>
Subject: Re: hsm violation

Robert Hancock wrote:
> Andrew Morton wrote:
>> On Sun, 24 Jun 2007 14:32:22 +0200 Enrico Sardi <enricoss@...cali.it>
>> wrote:
>>> [   61.176000] ata1.00: exception Emask 0x2 SAct 0x2 SErr 0x0 action
>>> 0x2 frozen
>>> [   61.176000] ata1.00: (spurious completions during NCQ issue=0x0
>>> SAct=0x2 FIS=005040a1:00000004)
>>
>> It's not obvious (to me) whether this is a driver bug, a hardware bug,
>> expected-normal-behaviour or what - those diagnostics (which we get to
>> see distressingly frequently) are pretty obscure.
> 
> The spurious completions during NCQ error is indicating that the drive
> has indicated it's completed NCQ command tags which weren't outstanding.
>  It's normally a result of a bad NCQ implementation on the drive.
> Technically we can live with it, but it's rather dangerous (if it
> indicates completions for non-outstanding commands, how do we know it
> doesn't indicate completions for actually outstanding commands that
> aren't actually completed yet..)
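
For reference, the values in the log line above can be picked apart roughly as follows.  This is only a sketch: it assumes the usual Set Device Bits FIS layout (FIS type in byte 0, the I bit in byte 1, status and error in bytes 2 and 3, completed NCQ tags in the second dword) and it treats the logged SAct value as the set of tags the driver still considered outstanding.  The function names are made up for illustration and are not libata's.

```python
def decode_sdb_fis(dword0, dword1):
    """Split the two logged FIS dwords into fields (assumed SDB FIS layout)."""
    return {
        "fis_type": dword0 & 0xFF,              # 0xA1 = Set Device Bits
        "interrupt": bool((dword0 >> 14) & 1),  # I bit (bit 6 of byte 1)
        "status": (dword0 >> 16) & 0xFF,
        "error": (dword0 >> 24) & 0xFF,
        "completed_tags": [t for t in range(32) if (dword1 >> t) & 1],
    }


def spurious_tags(fis_done_mask, outstanding_mask):
    """Tags the FIS reports as done that were not outstanding."""
    bogus = fis_done_mask & ~outstanding_mask
    return [t for t in range(32) if (bogus >> t) & 1]


# Values taken from the log: FIS=005040a1:00000004, SAct=0x2
fis = decode_sdb_fis(0x005040A1, 0x00000004)
print(fis["completed_tags"], spurious_tags(0x00000004, 0x2))
```

Read this way, the FIS claims tag 2 completed while only tag 1 was outstanding.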

There is a small race window there.  Please consider the following sequence.

1. The drive sends an SDB FIS containing a spurious completion.
2. The block layer issues a new r/w command to the drive while the SDB
FIS is still in flight.
3. The ata driver issues the command (the pending bit is set before the
command FIS is transmitted).
4. The controller finishes receiving the FIS from #1.  The driver reads
the mask and completes all indicated commands.  If the spurious
completion in #1 happens to match the slot allocated in #3, the driver
has just completed a command which hasn't been issued to the drive yet.
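
The sequence can be modelled with plain bitmasks.  The following is a toy sketch, not driver code: the names (simulate_race, complete_from_sdb) are hypothetical, and it assumes the driver simply completes every pending tag that the received mask marks as done.

```python
def complete_from_sdb(pending_mask, sdb_done_mask):
    """Complete every pending tag the FIS marks done.

    Returns (completed_mask, spurious_mask, new_pending_mask).
    """
    completed = pending_mask & sdb_done_mask
    spurious = sdb_done_mask & ~pending_mask
    return completed, spurious, pending_mask & ~completed


def simulate_race(spurious_tag, new_tag):
    """Walk through steps 1-4; report whether new_tag completes prematurely."""
    pending = 0

    # Step 1: the drive sends an SDB FIS with a bogus completion bit.
    sdb_in_flight = 1 << spurious_tag

    # Steps 2-3: a new command gets a slot and its pending bit is set
    # before the command FIS has actually reached the drive.
    pending |= 1 << new_tag
    command_reached_drive = False

    # Step 4: the FIS from step 1 lands and the driver processes it.
    completed, spurious, pending = complete_from_sdb(pending, sdb_in_flight)

    premature = bool(completed & (1 << new_tag)) and not command_reached_drive
    return premature, spurious


print(simulate_race(spurious_tag=1, new_tag=1))  # slot matches the bogus bit
print(simulate_race(spurious_tag=1, new_tag=2))  # slot does not match
```

In the first call the bogus bit is indistinguishable from a real completion, so nothing is flagged as spurious; in the second call the bit is caught because no command is pending on that tag.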

So, it actually is dangerous.  We might even be seeing the real
completion as a spurious one (the command was already completed
prematurely, so its tag is no longer outstanding when the real
completion arrives).

It seems all those HTS541* drives share this problem.  Four of them are
already on the blacklist and the other OS reportedly blacklists three of
them too.  I'll submit a patch to add HTS541616J9SA00.

Thanks.

-- 
tejun