Message-ID: <45D25CF2.5030508@gmail.com>
Date:	Tue, 13 Feb 2007 16:50:58 -0800
From:	Tejun Heo <htejun@...il.com>
To:	Robert Hancock <hancockr@...w.ca>
CC:	linux-kernel <linux-kernel@...r.kernel.org>,
	linux-ide@...r.kernel.org, edmudama@...il.com,
	Nicolas.Mailhot@...oste.net, Jeff Garzik <jeff@...zik.org>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Mark Lord <mlord@...ox.com>, Jens Axboe <jens.axboe@...cle.com>
Subject: Re: libata FUA revisited

[cc'ing Jeff, Alan, Mark and Jens.  Hi!]

Hello, Robert.

Robert Hancock wrote:
> Well, we should be able to determine that experimentally (at least on 
> specific controllers) with a little test program that just writes little 
> bits of data and fsyncs repeatedly (assuming that does in fact trigger 
> FUAs currently..) If it runs faster than the drive could possibly be 
> rewriting the physical disk then obviously the FUA bit is not getting 
> through and/or not respected and we can blacklist FUA on that controller.

That's right.
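
A minimal sketch of such a test, in Python for brevity. Everything here
is an assumption for illustration: the function names are made up,
whether fsync() actually results in FUA writes depends on the kernel,
filesystem and driver in use, and the "one rewrite per rotation" bound
is only a rough physical ceiling for rewriting the same sector.

```python
import os
import tempfile
import time


def fsync_rate(directory, iterations=200, payload=b"\0" * 512):
    """Rewrite one small block and fsync after each write.

    Returns the number of write+fsync cycles completed per second.
    `directory` should live on the drive under test (hypothetical helper).
    """
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        fd = f.fileno()
        start = time.perf_counter()
        for _ in range(iterations):
            os.pwrite(fd, payload, 0)  # always overwrite offset 0
            os.fsync(fd)               # ask for the data to reach stable storage
        elapsed = time.perf_counter() - start
    return iterations / max(elapsed, 1e-9)


def max_plausible_rate(rpm):
    """Assumed ceiling: at most about one rewrite of a sector per rotation."""
    return rpm / 60.0


def looks_suspicious(rate, rpm, slack=1.5):
    """True if the measured rate is well above what the platter allows."""
    return rate > slack * max_plausible_rate(rpm)
```

For example, calling fsync_rate() on a directory of the filesystem under
test and passing the result to looks_suspicious() with the drive's RPM
would flag a drive/controller pair that completes synced rewrites
faster than the medium could plausibly absorb them. A tmpfs or a
controller with battery-backed cache would of course defeat the check.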

> Also, the FUA bit in the NCQ commands is in the device register, so it's 
> not like the PMP fields where it's not used for anything else and so the 
> controller messing with it wouldn't be otherwise noticed..

Yeap, I just wanted to point out (hence the FWIW) that the seemingly
innocent ahci does mangle some part of the FIS given in memory.  I agree
that this is much less likely with the FUA bit.

>> So, actually, I was thinking about *always* using the non-NCQ FUA 
>> opcode.  As currently implemented, FUA request is always issued by 
>> itself, so NCQ doesn't make any difference there.  So, I think it 
>> would be better to turn on FUA on driver-by-driver basis whether the 
>> controller supports NCQ or not.
> 
> Unfortunately not all drives that support NCQ support the non-NCQ FUA 
> commands (my Seagates are like this).

And I'm a bit scared to set the FUA bit on such drives and trust that
they will actually do FUA, so our opinions aren't too far apart.  :-)

> There's definitely a potential advantage to FUA with NCQ - if you have 
> non-synchronous accesses going on concurrently with synchronous ones, if 
> you have to use non-NCQ FUA or flush cache commands, you have to wait 
> for all the IOs of both types to drain out before you can issue the 
> flush (since those can't be overlapped with the NCQ read/writes). And if 
> you can only use flush cache, then you're forcing all the writes to be 
> flushed including the non-synchronous ones you didn't care about. 
> Whether or not the block layer currently exploits this I don't know, but 
> it definitely could.

The current barrier implementation uses the following sequences for 
no-FUA and FUA cases.

1. w/o FUA

normal operation -> barrier issued -> drain IO -> flush -> barrier 
written -> flush -> normal operation resumes

2. w/ FUA

normal operation -> barrier issued -> drain IO -> flush -> barrier 
written / FUA -> normal operation resumes

So, the FUA write is issued by itself.  This isn't really efficient, and
frequent barriers hurt performance badly.  If we can change that,
NCQ FUA will certainly be beneficial.
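
The two sequences above can be modeled as a toy sketch, just to make the
difference countable. This is not block layer code; the function name
and the step strings are made up for illustration.

```python
def barrier_sequence(use_fua):
    """Return the steps between 'barrier issued' and 'normal operation resumes'."""
    steps = ["drain IO", "flush"]           # common to both cases
    if use_fua:
        steps.append("barrier write (FUA)")  # FUA write replaces the post-flush
    else:
        steps += ["barrier write", "flush"]  # plain write needs a second flush
    return steps


def flush_count(use_fua):
    """Number of cache flushes a single barrier costs in this model."""
    return barrier_sequence(use_fua).count("flush")
```

In this model FUA saves one of the two flushes per barrier, but both
cases still start with "drain IO", which is why the FUA write ends up
issued by itself.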

>> Well, I might be being too paranoid but silent FUA failure would be 
>> really hard to diagnose if that ever happens (and I'm fairly certain 
>> that it will on some firmwares).
> 
> Well, there are also probably drives that ignore flush cache commands or 
> fail to do other things that they should. There's only so far we can go 
> in coping if the firmware authors are being retarded. If any drive is 
> broken like that we should likely just blacklist NCQ on it entirely as 
> obviously little thought or testing went into the implementation..

FLUSH has been around for quite a long time now and most drives don't
have problems with it.  FUA on ATA is still quite new and libata will be
the first major user of it if we enable it by default.  It just seems
too easy for firmware to ignore that bit and successfully complete the
write; there isn't any safety net, as opposed to using a separate
opcode.  So, I'm a bit nervous.

Any comments, people?

Thanks.

-- 
tejun