Message-ID: <Y00riC6UxmLDhI5P@T590>
Date: Mon, 17 Oct 2022 18:16:40 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Chaitanya Kulkarni <chaitanyak@...dia.com>
Cc: "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"axboe@...nel.dk" <axboe@...nel.dk>,
"damien.lemoal@...nsource.wdc.com" <damien.lemoal@...nsource.wdc.com>,
"johannes.thumshirn@....com" <johannes.thumshirn@....com>,
"bvanassche@....org" <bvanassche@....org>,
"shinichiro.kawasaki@....com" <shinichiro.kawasaki@....com>,
"vincent.fu@...sung.com" <vincent.fu@...sung.com>,
"yukuai3@...wei.com" <yukuai3@...wei.com>
Subject: Re: [PATCH] null_blk: allow teardown on request timeout
On Mon, Oct 17, 2022 at 10:04:26AM +0000, Chaitanya Kulkarni wrote:
> On 10/17/22 02:50, Ming Lei wrote:
> > On Mon, Oct 17, 2022 at 09:30:47AM +0000, Chaitanya Kulkarni wrote:
> >>
> >>>> + /*
> >>>> + * Unblock any pending dispatch I/Os before we destroy the device.
> >>>> + * From null_destroy_dev()->del_gendisk() will set GD_DEAD flag
> >>>> + * causing any new I/O from __bio_queue_enter() to fail with -ENODEV.
> >>>> + */
> >>>> + blk_mq_unquiesce_queue(nullb->q);
> >>>> +
> >>>> + null_destroy_dev(nullb);
> >>>
> >>> destroying device is never good cleanup for handling timeout/abort, and it
> >>> should have been the last straw any time.
> >>>
> >>
> >> That is exactly why I've added the rq_abort_limit, so until the limit
> >> is not reached null_abort_work() will not get scheduled and device is
> >> not destroyed.
> >
> > I meant destroying device should only be done iff the normal abort handler
> > can't recover the device, however, your patch simply destroys device
> > without running any abort handling.
> >
>
> I did not understand your comment, can you please elaborate on exactly
> where and which abort handlers needs to be called in this patch before
> null_destroy_nullb() ?
When a request times out, something may have gone wrong that needs to be
recovered, so the timeout handler should attempt that recovery first.
>
> the objective of this patch it to simulate the teardown scenario
> from timeout handler so it can get tested on regular basis with
> null_blk ...
Why does the teardown scenario have to be triggered by a timeout? It looks
like you consider tearing down and destroying the device on timeout to be a
normal, common approach, but I don't think it is: the device shouldn't be
removed if it can still work. I have received complaints about disks
disappearing just because of a request timeout, for example with nvme-pci.
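
To make the ordering concrete, here is a rough userspace sketch of the
escalation I have in mind. It is not null_blk code and uses no kernel API:
struct sim_dev, enum timeout_action and sim_handle_timeout() are hypothetical
names made up for illustration, and abort_limit only plays a role similar in
spirit to the rq_abort_limit in your patch. The abort_ok argument stands in
for "the normal abort handler recovered the device".

```c
#include <stdbool.h>

/* Hypothetical outcomes of one timeout event; not kernel symbols. */
enum timeout_action {
	TIMEOUT_RECOVERED,	/* abort handling recovered the device */
	TIMEOUT_RETRY,		/* recovery failed, limit not reached yet */
	TIMEOUT_TEARDOWN,	/* last resort: the device is removed */
};

/* Hypothetical device state for the simulation. */
struct sim_dev {
	unsigned int failed_aborts;	/* consecutive failed recoveries */
	unsigned int abort_limit;	/* failures tolerated before teardown */
	bool removed;			/* set once the device is torn down */
};

/*
 * Handle one timeout. Recovery is always tried first (modelled by
 * abort_ok); the device is removed only after abort_limit consecutive
 * recovery failures.
 */
static enum timeout_action sim_handle_timeout(struct sim_dev *dev,
					       bool abort_ok)
{
	if (dev->removed)
		return TIMEOUT_TEARDOWN;

	if (abort_ok) {
		/* The device still works, so forget earlier failures. */
		dev->failed_aborts = 0;
		return TIMEOUT_RECOVERED;
	}

	if (++dev->failed_aborts < dev->abort_limit)
		return TIMEOUT_RETRY;

	/* Recovery kept failing: only now is removal justified. */
	dev->removed = true;
	return TIMEOUT_TEARDOWN;
}
```

With abort_limit set to 2, a timeout that is recovered leaves the device in
place, and teardown happens only after two recovery failures in a row.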
thanks,
Ming