Message-ID: <20070927233437.GF18686@austin.ibm.com>
Date: Thu, 27 Sep 2007 18:34:37 -0500
From: linas@...tin.ibm.com (Linas Vepstas)
To: Matthew Wilcox <matthew@....cx>
Cc: linux-scsi@...r.kernel.org, linuxppc-dev@...abs.org,
linux-kernel@...r.kernel.org, linux-pci@...ey.karlin.mff.cuni.cz
Subject: Re: [PATCH 2/2]: PCI Error Recovery: Symbios SCSI First Failure
On Thu, Sep 27, 2007 at 04:10:31PM -0600, Matthew Wilcox wrote:
> In the error handler, we wait_for_completion(io_reset_wait).
> In sym2_io_error_detected, we init_completion(io_reset_wait).
> Isn't it possible that we hit the error handler before we hit the
> io_error_detected path, and thus the completion wait is lost?
> Since the completion is already initialised in sym_attach(), I don't
> think we need to initialise it in sym2_io_error_detected().
> Makes sense to just delete it?
Good catch. But no ... and I had to study this a bit. Bear with me:
It is enough to call init_completion() once, and not once per use:
it initializes spinlocks, which shouldn't be initialized twice.
But, that completion might be used multiple times when there are
multiple errors, and so, before using it a second time, one must
set completion->done = 0. The INIT_COMPLETION() macro does this.
One must have completion->done = 0 before every use, as otherwise,
wait_for_completion() won't actually wait. And since complete_all()
sets x->done += UINT_MAX/2, I'm pretty sure x->done won't be zero
the next time we use it, unless we make it so.
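
To make the counter argument concrete, here is a minimal userspace
sketch of just the ->done bookkeeping. It is a model, not kernel code:
the model_* names are made up for illustration, there is no locking or
sleeping, and the "consume one count per waiter" step is my reading of
wait_for_completion(), so treat that part as an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model of the completion counter only (no wait queue, no lock). */
struct model_completion {
	unsigned int done;
};

/* Models the counter side of init_completion(): start with done == 0. */
static void model_init(struct model_completion *x)
{
	x->done = 0;
}

/* Models complete_all(): bump the counter by UINT_MAX/2. */
static void model_complete_all(struct model_completion *x)
{
	x->done += UINT_MAX / 2;
}

/*
 * Models the check in wait_for_completion(): returns true if the caller
 * would have to sleep; otherwise consumes one count (assumed behaviour)
 * and returns false.
 */
static bool model_would_block(struct model_completion *x)
{
	if (x->done == 0)
		return true;
	x->done--;
	return false;
}

/* Models what INIT_COMPLETION() does: reset done to 0 before reuse. */
static void model_rearm(struct model_completion *x)
{
	x->done = 0;
}

/*
 * Runs one full error cycle, then reports whether a waiter on a second
 * error would block, with or without re-arming in between.
 */
static bool model_second_wait_blocks(bool rearm)
{
	struct model_completion c;

	model_init(&c);
	assert(model_would_block(&c));   /* first error: waiter would sleep */
	model_complete_all(&c);          /* reset finished */
	assert(!model_would_block(&c));  /* waiter gets through */

	if (rearm)
		model_rearm(&c);
	return model_would_block(&c);    /* second error */
}
```

Calling model_second_wait_blocks(false) returns false, i.e. the second
waiter falls straight through on the stale count, while
model_second_wait_blocks(true) returns true, which is the behaviour
we want on a second error.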
So I need to find a place to safely call INIT_COMPLETION() again,
after the completion has been used. At the moment, I'm stumped
as to where to do this.
---- [think ... think ... think] ----
I think the race you describe above is harmless. The first time
that sym_eh_handler() will run, it will be with SYM_EH_ABORT,
and it doesn't matter if we lose that, since the device is hosed
anyway. At some later time, it will run with SYM_EH_DEVICE_RESET
and then SYM_EH_BUS_RESET and then SYM_EH_HOST_RESET, and we won't
miss those, since, by now, sym2_io_error_detected() will have run.
So, by my reading, I'd say that init_completion() in
sym2_io_error_detected() has to stay (although perhaps
it should be replaced by the INIT_COMPLETION() macro.)
Removing it will prevent correct operation on the second
and subsequent errors.
--Linas