Message-ID: <20071204091334.GG23294@kernel.dk>
Date:	Tue, 4 Dec 2007 10:13:34 +0100
From:	Jens Axboe <jens.axboe@...cle.com>
To:	Neil Brown <neilb@...e.de>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: Is BIO_RW_FAILFAST really usable?

On Tue, Dec 04 2007, Neil Brown wrote:
> 
> I've been looking at using BIO_RW_FAILFAST in md/raid to improve
> handling of some error cases.
> 
> This is particularly significant for the DASD driver (s390 specific).
> I believe it uses optical fibre to connect to the drives.  When one of
> these paths is unplugged, IO requests will block until an operator
> runs a command to reset the card (or until it is plugged back in).
> The only way to avoid this blockage is to use BIO_RW_FAILFAST.  So
> we really need BIO_RW_FAILFAST for a reliable RAID1 configuration on
> DASD drives.
> 
> However, I just tested BIO_RW_FAILFAST on my SATA drives; the controller is:
> 
> 02:06.0 RAID bus controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
> 
> (not using the card's minimal RAID functionality) and requests fail
> immediately and always with e.g.
> 
> sd 2:0:0:0: [sdc] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> end_request: I/O error, dev sdc, sector 2048
> 
> So fail fast obviously isn't generally usable.
> 
> What is the answer here?  Is the Silicon Image driver doing the wrong
> thing, or is DASD doing the wrong thing, or is BIO_RW_FAILFAST
> under-specified and we really need multiple flags or what?

Hrmpf. It looks like the SCSI layer is a little too trigger-happy. Any
chance you could try to trace where this happens?

-- 
Jens Axboe
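
For readers less familiar with the flag under discussion, here is a minimal
sketch of how a RAID1-style resubmission path on kernels of this era might
mark a per-mirror bio as fail-fast before sending it down, so that a dead
path returns an error promptly instead of blocking.  The helper name and
surrounding structure are illustrative assumptions, not code from md; only
bi_rw, BIO_RW_FAILFAST and generic_make_request() are kernel interfaces of
the time.

    /*
     * Illustrative sketch only: mark one mirror's bio as fail-fast and
     * submit it.  The function name is hypothetical; the flag handling
     * shows how BIO_RW_FAILFAST was set in the bi_rw bitmask circa 2.6.24.
     */
    #include <linux/bio.h>
    #include <linux/blkdev.h>

    static void submit_mirror_bio_failfast(struct bio *bio,
                                           struct block_device *bdev,
                                           sector_t sector)
    {
            bio->bi_bdev = bdev;
            bio->bi_sector = sector;
            /* Ask lower layers to fail immediately rather than retry/queue. */
            bio->bi_rw |= (1 << BIO_RW_FAILFAST);
            generic_make_request(bio);
    }

If the path is indeed gone, the error comes back quickly through the bio's
bi_end_io completion and md can redirect the request to another mirror,
which is the behaviour wanted for the DASD case described above.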

