Message-ID: <06268327-cfed-f266-34a7-fda69411ef2a@huawei.com>
Date: Tue, 26 Sep 2023 20:57:11 +0800
From: Wenchao Hao <haowenchao2@...wei.com>
To: Mike Christie <michael.christie@...cle.com>,
Christoph Hellwig <hch@...radead.org>
CC: "James E . J . Bottomley" <jejb@...ux.ibm.com>,
"Martin K . Petersen" <martin.petersen@...cle.com>,
<linux-scsi@...r.kernel.org>, Hannes Reinecke <hare@...e.de>,
<linux-kernel@...r.kernel.org>, <louhongxiang@...wei.com>,
<lixiaokeng@...wei.com>
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On 2023/9/26 1:54, Mike Christie wrote:
> On 9/25/23 10:07 AM, Wenchao Hao wrote:
>> On 2023/9/25 22:55, Christoph Hellwig wrote:
>>> Before we add another new error handling mechanism we need to fix the
>>> old one first. Hannes' work on not passing the scsi_cmnd to the various
>>> reset handlers hasn't made a lot of progress in the last five years and
>>> we'll need to urgently fix that first before adding even more
>>> complexity.
>>>
>> I observed Hannes's patches posted about one year ago, it has not been
>> applied yet. I don't know if he is still working on it.
>>
>> My patches do not depend much on that work, I think the conflict can be
>> solved fast between two changes.
>
> I think we want to figure out Hannes's patches first.
>
> For a new EH design we will want to be able to do multiple TMFs in parallel
> on the same host/target right?
>
It's not necessary to do multiple TMFs in parallel; it's enough to make sure
that the TMFs do not affect each other.

For example, we have two devices: 0:0:0:0 and 0:0:0:1.
Both of them request a device reset. The resets do not happen in parallel
but serially: if 0:0:0:0 has a device reset in progress, 0:0:0:1 just waits
for 0:0:0:0 to finish.
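
To illustrate the serial behaviour, here is a minimal userspace sketch, not
code from the patch set. All names (demo_reset_req, demo_reset_queue and the
demo_* helpers) are hypothetical. It assumes a small per-host FIFO of pending
device resets, where the next reset is started only after the current one has
completed.

```c
#include <stddef.h>

#define DEMO_MAX_PENDING 8

/* Hypothetical request: identifies the device as host:channel:id:lun. */
struct demo_reset_req {
	int host, channel, id, lun;
};

/* Hypothetical per-host queue; at most one reset runs at a time. */
struct demo_reset_queue {
	struct demo_reset_req pending[DEMO_MAX_PENDING];
	size_t head;	/* index of the oldest pending request */
	size_t count;	/* number of pending requests */
	int busy;	/* nonzero while a reset is in progress */
};

/* Queue a device reset. Returns 0 on success, -1 if the queue is full. */
static int demo_queue_reset(struct demo_reset_queue *q,
			    struct demo_reset_req req)
{
	if (q->count == DEMO_MAX_PENDING)
		return -1;
	q->pending[(q->head + q->count) % DEMO_MAX_PENDING] = req;
	q->count++;
	return 0;
}

/*
 * Start the oldest pending reset, but only if no reset is in progress.
 * Returns 1 and fills *out if a reset was started, 0 otherwise.
 */
static int demo_start_next(struct demo_reset_queue *q,
			   struct demo_reset_req *out)
{
	if (q->busy || q->count == 0)
		return 0;
	*out = q->pending[q->head];
	q->head = (q->head + 1) % DEMO_MAX_PENDING;
	q->count--;
	q->busy = 1;
	return 1;
}

/* Mark the running reset as finished so the next one may start. */
static void demo_reset_done(struct demo_reset_queue *q)
{
	q->busy = 0;
}
```

With 0:0:0:0 and 0:0:0:1 both queued, demo_start_next() starts 0:0:0:0 first
and refuses to start 0:0:0:1 until demo_reset_done() has been called.
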
> The problem is that we need to be able to make forward progress in the EH
> path and not fail just because we can't allocate memory for a TMF related
> struct. To accomplish this now, drivers will use mempools, preallocate TMF
> related structs/mem/tags with their scsi_cmnd related structs, preallocate
> per host/target/device related structs or ignore what I wrote above and just
> fail.
>
> Hannes's patches fix up the eh callouts so they don't pass in a scsi_cmnd
> when it's not needed. That seems nice because after that, then for your new
> EH we can begin to standardize on how to handle preallocation of drivers
> resources needed to perform TMFs for your new EH. It could be a per
> device/target/host callout to allow drivers to preallocate, then scsi-ml calls
> into the drivers with that data. It doesn't have to be exactly like that or
> anything close. It would be nice for drivers to not have to think about this
> type of thing and scsi-ml just to handle the resource management for us when
> there are multiple TMFs in progress.
>
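
The preallocation idea quoted above could look roughly like the following
userspace sketch. It is only an illustration under stated assumptions: the
names (demo_tmf_ctx, demo_tmf_pool and the demo_tmf_* helpers) are
hypothetical, it is not the kernel mempool API, and it is not what any driver
does today. The point is that every TMF context is reserved at setup time, so
the EH path never has to allocate memory.

```c
#include <stddef.h>

#define DEMO_TMF_POOL_SIZE 4

/* Hypothetical per-TMF context, reserved ahead of time. */
struct demo_tmf_ctx {
	int in_use;	/* nonzero while a TMF owns this slot */
	int lun;	/* device the TMF is addressed to */
};

/* Hypothetical fixed pool, e.g. one per host. */
struct demo_tmf_pool {
	struct demo_tmf_ctx slots[DEMO_TMF_POOL_SIZE];
};

/* Called once at setup time, before any error handling can run. */
static void demo_tmf_pool_init(struct demo_tmf_pool *pool)
{
	for (size_t i = 0; i < DEMO_TMF_POOL_SIZE; i++) {
		pool->slots[i].in_use = 0;
		pool->slots[i].lun = -1;
	}
}

/*
 * Take a preallocated context. Never allocates memory; returns NULL
 * only when every slot is already owned by a TMF in progress.
 */
static struct demo_tmf_ctx *demo_tmf_get(struct demo_tmf_pool *pool, int lun)
{
	for (size_t i = 0; i < DEMO_TMF_POOL_SIZE; i++) {
		if (!pool->slots[i].in_use) {
			pool->slots[i].in_use = 1;
			pool->slots[i].lun = lun;
			return &pool->slots[i];
		}
	}
	return NULL;
}

/* Return a context to the pool once its TMF has completed. */
static void demo_tmf_put(struct demo_tmf_ctx *ctx)
{
	ctx->in_use = 0;
	ctx->lun = -1;
}
```

If the mid-layer owned such a pool, a driver callout would receive an already
reserved context instead of allocating its own in the EH path.
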