Message-ID: <47826E63.9070008@cs.wisc.edu>
Date: Mon, 07 Jan 2008 12:24:35 -0600
From: Mike Christie <michaelc@...wisc.edu>
To: James Bottomley <James.Bottomley@...senPartnership.com>
CC: Hannes Reinecke <hare@...e.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Gabriel C <nix.or.die@...glemail.com>,
linux-kernel@...r.kernel.org, linux-scsi@...r.kernel.org
Subject: Re: Multipath failover handling (Was: Re: 2.6.24-rc3-mm1)
James Bottomley wrote:
>>> However, there's still devloss_tmo to consider ... even in
>>> multipath, I don't think you want to signal path failure until
>>> devloss_tmo has fired otherwise you'll get too many transient up/down
>>> events which damage performance if the array has an expensive failover
>>> model.
>>>
>> Yes. But currently we have a very high failover latency, as we always
>> have to wait for the requeued commands to time out.
>> Hence we're damaging performance on arrays with inexpensive failover.
>
> If it's an either/or choice between the two, that shows our current
> approach to multipath is broken.
>
>>> The other problem is what to do with in-flight commands at the time the
>>> link went down. With your current patch, they're still stuck until they
>>> time out ... surely there needs to be some type of recovery mechanism
>>> for these?
>>>
>> Well, the in-flight commands are owned by the HBA driver, which should
>> have the proper code to terminate / return those commands with the
>> appropriate codes. They will then be rescheduled and will be caught
>> like 'normal' IO requests.
>
> But my point is that if a driver goes blocked, those commands will be
> forced to wait the blocked timeout anyway, so your proposed patch does
> nothing to improve the case for dm anyway ... you only avoid commands
> stuck when a device goes blocked if by chance its request queue was
> empty.
How about my patches to use the new transport error values and make
iscsi and fc behave the same way?
The problem I think Hannes and I are both trying to solve is this:
1. We do not want to wait for dev_loss_tmo seconds for failover.
2. The FC drivers can hook into the fast_io_fail_tmo related callouts
and set that tmo to a very low value, like a couple of seconds, when
they are being used with multipath, so failovers are fast. However,
there is a bug: when the fast_io_fail_tmo fires, requests that made it
to the driver get failed and returned to the multipath layer, but
commands sitting in the blocked request queue stay stuck there until
dev_loss_tmo fires.
With my patches here (they need to be rediffed, and for FC I need to
address JamesS's comments about not using a new field for the
fast_fail_timeout state bit):
http://marc.info/?l=linux-scsi&m=117399843216280&w=2
http://marc.info/?l=linux-scsi&m=117399544112073&w=2
http://marc.info/?l=linux-scsi&m=117399844316771&w=2
http://marc.info/?l=linux-scsi&m=117400203324693&w=2
http://marc.info/?l=linux-scsi&m=117400203324690&w=2
For FC we can use the fast_io_fail_tmo for fast failovers, and commands
will not get stuck in a blocked queue for dev_loss_tmo seconds, because
when the fast_io_fail_tmo fires the target's queues are unblocked and
fc_remote_port_chkready() kicks in (iSCSI does the same with the
patches in the links). And with the patches, if multipath-tools sends
its path-testing IO, it will get a DID_TRANSPORT_* error code that it
can use to make a sound path-failure decision.