[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1195926253.3195.16.camel@localhost.localdomain>
Date: Sat, 24 Nov 2007 19:44:13 +0200
From: James Bottomley <James.Bottomley@...elEye.com>
To: Hannes Reinecke <hare@...e.de>
Cc: Laurent Riffard <laurent.riffard@...e.fr>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linux-ide@...r.kernel.org,
linux-scsi@...r.kernel.org
Subject: Re: 2.6.24-rc3-mm1: I/O error, system hangs
Probing intermittent failures in Domain Validation, even with the fixes
applied leads me to the conclusion that there are further problems with
this commit:
commit fc5eb4facedbd6d7117905e775cee1975f894e79
Author: Hannes Reinecke <hare@...e.de>
Date: Tue Nov 6 09:23:40 2007 +0100
[SCSI] Do not requeue requests if REQ_FAILFAST is set
The essence of the problems is that you're causing REQ_FAILFAST to
terminate commands with error on requeuing conditions, some of which are
relatively common on most SCSI devices. While this may be the correct
behaviour for multi-path, it's certainly wrong for the previously
understood meaning of REQ_FAILFAST, which was don't retry on error,
which is why domain validation and other applications use it to control
error handling, but don't expect to get failures for a simple requeue
are now spitting errors.
I honestly can't see that, even for the multi-path case, returning an
error when we're over queue depth is the correct thing to do (it may not
matter to something like a symmetrix, but an array that has a non-zero
cost associated with a path change, like a CPQ HSV or the AVT
controllers, will show fairly large slow downs if you do this). Even if
this is the desired behaviour (and I think that's a policy issue),
DID_NO_CONNECT is almost certainly the wrong error to be sending back.
This patch fixes up domain validation to work again correctly, however,
I really think it's just a bandaid. Do you want to rethink the above
commit?
James
Index: BUILD-2.6/drivers/scsi/scsi_lib.c
===================================================================
--- BUILD-2.6.orig/drivers/scsi/scsi_lib.c 2007-11-24 11:25:20.000000000 -0600
+++ BUILD-2.6/drivers/scsi/scsi_lib.c 2007-11-24 11:26:22.000000000 -0600
@@ -1552,7 +1552,8 @@ static void scsi_request_fn(struct reque
break;
if (!scsi_dev_queue_ready(q, sdev)) {
- if (req->cmd_flags & REQ_FAILFAST) {
+ if ((req->cmd_flags & REQ_FAILFAST) &&
+ !(req->cmd_flags & REQ_PREEMPT)) {
scsi_kill_request(req, q);
continue;
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists