Message-ID: <87lefv622n.fsf@linux.ibm.com>
Date: Tue, 04 Jul 2023 19:04:00 +0200
From: "Marc Hartmayer" <mhartmay@...ux.ibm.com>
To: Yu Kuai <yukuai1@...weicloud.com>
Cc: linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
yi.zhang@...wei.com, yangerkun@...wei.com, hch@....de,
chaitanyak@...dia.com, shinichiro.kawasaki@....com,
dgilbert@...erlog.com, jejb@...ux.ibm.com,
martin.petersen@...cle.com, yukuai3@...wei.com, axboe@...nel.dk,
Benjamin Block <bblock@...ux.ibm.com>
Subject: Re: [PATCH] scsi/sg: don't grab scsi host module reference
On Thu, Jun 22, 2023 at 12:01 AM +0800, Yu Kuai <yukuai1@...weicloud.com> wrote:
> From: Yu Kuai <yukuai3@...wei.com>
>
> In order to prevent request_queue from being freed before the blktrace
> debugfs entries are cleaned up, commit db59133e9279 ("scsi: sg: fix
> blktrace debugfs entries leakage") uses scsi_device_get(). However,
> scsi_device_get() also grabs a reference on the scsi host module, so
> that module can't be removed.
>
> It's reported that blktests can't unload scsi_debug after block/001:
>
> blktests (master) # ./check block
> block/001 (stress device hotplugging) [failed]
> +++ /root/blktests/results/nodev/block/001.out.bad 2023-06-19
> Running block/001
> Stressing sd
> +modprobe: FATAL: Module scsi_debug is in use.
>
> Fix this problem by grabbing a request_queue reference directly, so that
> the scsi host module can still be unloaded while the request_queue is
> pinned by the sg device.
>
> Reported-by: Chaitanya Kulkarni <chaitanyak@...dia.com>
> Link: https://lore.kernel.org/all/1760da91-876d-fc9c-ab51-999a6f66ad50@nvidia.com/
> Fixes: db59133e9279 ("scsi: sg: fix blktrace debugfs entries leakage")
> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
> ---
> drivers/scsi/sg.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> index 2433eeef042a..dcb73787c29d 100644
> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -1497,7 +1497,7 @@ sg_add_device(struct device *cl_dev)
> int error;
> unsigned long iflags;
>
> - error = scsi_device_get(scsidp);
> + error = blk_get_queue(scsidp->request_queue);
> if (error)
> return error;
>
> @@ -1558,7 +1558,7 @@ sg_add_device(struct device *cl_dev)
> out:
> if (cdev)
> cdev_del(cdev);
> - scsi_device_put(scsidp);
> + blk_put_queue(scsidp->request_queue);
> return error;
> }
>
> @@ -1575,7 +1575,7 @@ sg_device_destroy(struct kref *kref)
> */
>
> blk_trace_remove(q);
> - scsi_device_put(sdp->device);
> + blk_put_queue(q);
>
> write_lock_irqsave(&sg_index_lock, flags);
> idr_remove(&sg_index_idr, sdp->index);
> --
> 2.39.2
Hi,

This change (bisected) triggers a regression in our KVM on s390x CI. The
symptom is that a “scsi_debug device” does not bind to the scsi_generic
driver. On s390x you can reproduce the problem as follows (I have not
tested on x86):

With this patch applied:
$ sudo modprobe scsi_debug
$ # Get the 'scsi_host,channel,target_number,LUN' tuple for the scsi_debug device
$ lsscsi |grep scsi_debug |awk '{ print $1 }'
[0:0:0:0]
$ sudo stat /sys/bus/scsi/devices/0:0:0:0/scsi_generic
stat: cannot statx '/sys/bus/scsi/devices/0:0:0:0/scsi_generic': No such file or directory
With this patch reverted:
$ sudo modprobe scsi_debug
$ lsscsi |grep scsi_debug |awk '{ print $1 }'
[0:0:0:0]
$ sudo stat /sys/bus/scsi/devices/0:0:0:0/scsi_generic
File: /sys/bus/scsi/devices/0:0:0:0/scsi_generic
Size: 0 Blocks: 0 IO Block: 4096 directory
Device: 0,20 Inode: 12155 Links: 3
…
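The manual check above could be scripted roughly as follows. This is only a sketch: the function name check_sg_binding and its optional sysfs-root argument are made up for illustration (the argument exists so the check can be pointed at a fake tree), and it looks at every SCSI device under that root, not only scsi_debug ones.

```shell
#!/bin/sh
# Hypothetical helper (not part of any existing tool): report, for every
# H:C:T:L entry under a sysfs SCSI device directory, whether a
# scsi_generic subdirectory exists, i.e. whether the sg driver bound.
#
# $1: directory to scan; defaults to /sys/bus/scsi/devices.
# Returns 0 if every device has scsi_generic, 1 if at least one does not.
check_sg_binding() {
    root="${1:-/sys/bus/scsi/devices}"
    missing=0
    # The pattern needs three colons, so hostN and targetH:C:T entries
    # in the same directory are skipped.
    for dev in "$root"/*:*:*:*; do
        # If nothing matched, the pattern stays literal; skip it.
        [ -e "$dev" ] || continue
        if [ -d "$dev/scsi_generic" ]; then
            echo "OK $(basename "$dev")"
        else
            echo "MISSING $(basename "$dev")"
            missing=1
        fi
    done
    return "$missing"
}
```

After "sudo modprobe scsi_debug", calling check_sg_binding with no argument would be expected to print a MISSING line for 0:0:0:0 with the patch applied and an OK line with it reverted, going by the stat results above.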
Any ideas?
Marc