[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <47B08F6B.2020502@redhat.com>
Date: Mon, 11 Feb 2008 13:09:47 -0500
From: Jarod Wilson <jwilson@...hat.com>
To: Stefan Richter <stefanr@...6.in-berlin.de>
CC: linux1394-devel@...ts.sourceforge.net, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 9/9] firewire: fw-sbp2: fix I/O errors during reconnect
Stefan Richter wrote:
> While fw-sbp2 takes the necessary time to reconnect to a logical unit
> after bus reset, the SCSI core keeps sending new commands. They are all
> immediately completed with host busy status, and application clients or
> filesystems will break quickly. The SCSI device might even be taken
> offline: http://bugzilla.kernel.org/show_bug.cgi?id=9734
>
> The only remedy seems to be to block the SCSI device until reconnect.
> Alas the SCSI core has no useful API to block only one logical unit i.e.
> the scsi_device, therefore we block the entire Scsi_Host. This
> currently corresponds to an SBP-2 target. In case of targets with
> multiple logical units, we need to satisfy the dependencies between
> logical units by carefully tracking the blocking state of the target and
> its units. We block all logical units of a target as soon as one of
> them needs to be blocked, and keep them blocked until all of them are
> ready to be unblocked.
>
> Furthermore, as the history of the old sbp2 driver has shown, the
> scsi_block_requests() API is a minefield with high potential of
> deadlocks. We therefore take extra measures to keep logical units
> unblocked during __scsi_add_device() and during shutdown.
>
> Signed-off-by: Stefan Richter <stefanr@...6.in-berlin.de>
> +/*
> + * Blocks lu->tgt if all of the following conditions are met:
> + * - Login, INQUIRY, and high-level SCSI setup of all logical units of the
> + * target have been successfully finished (indicated by dont_block == 0).
> + * - The lu->generation is stale. sbp2_reconnect will unblock lu later.
> + */
> +static void sbp2_conditionally_block(struct sbp2_logical_unit *lu)
> +{
> + struct fw_card *card = fw_device(lu->tgt->unit->device.parent)->card;
> +
> + if (!atomic_read(&lu->tgt->dont_block) &&
> + lu->generation != card->generation &&
> + atomic_cmpxchg(&lu->blocked, 0, 1) == 0) {
Just to be absolutely sure, we don't need any barriers here to ensure we
get the right generations, do we?
Also, this isn't expected to let I/O survive a disk being unplugged
briefly, then plugged back in, is it? (I recall that being discussed,
but I think it was as a 'would be nice to do in the future' thing).
--
Jarod Wilson
jwilson@...hat.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists