Message-ID: <ed3ec0aa-04a6-43bc-9a1d-5c15a5a46643@nvidia.com>
Date: Mon, 10 Jun 2024 03:44:44 +0000
From: Chaitanya Kulkarni <chaitanyak@...dia.com>
To: Gulam Mohamed <gulam.mohamed@...cle.com>, "linux-block@...r.kernel.org"
<linux-block@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>
CC: "yukuai1@...weicloud.com" <yukuai1@...weicloud.com>, "hch@....de"
<hch@....de>, "axboe@...nel.dk" <axboe@...nel.dk>
Subject: Re: [PATCH V4 for-6.10/block] loop: Fix a race between loop detach
and loop open
On 6/7/24 12:06, Gulam Mohamed wrote:
> 1. Userspace sends the command "losetup -d" which uses the open() call
> to open the device
> 2. Kernel receives the ioctl command "LOOP_CLR_FD" which calls the
> function loop_clr_fd()
> 3. If LOOP_CLR_FD is the first command received at the time, then the
> AUTOCLEAR flag is not set and deletion of the
> loop device proceeds ahead and scans the partitions (drop/add
> partitions)
>
>         if (disk_openers(lo->lo_disk) > 1) {
>                 lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
>                 loop_global_unlock(lo, true);
>                 return 0;
>         }
>
> 4. Before scanning partitions, it will check to see if any partition of
> the loop device is currently opened
> 5. If any partition is opened, then it will return EBUSY:
>
>         if (disk->open_partitions)
>                 return -EBUSY;
>
> 6. So, after receiving the "LOOP_CLR_FD" command and just before the above
> check for open_partitions, if any other command (like blkid) opens any
> partition of the loop device, then the partition scan will not proceed
> and EBUSY is returned, as shown in the code above
> 7. But in "__loop_clr_fd()", this EBUSY error is not propagated
> 8. We have noticed that this causes the partitions of the loop device to
> remain stale even after the loop device is detached, resulting in
> I/O errors on the partitions
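
The failure mode in steps 5 to 8 can be sketched as a small userspace model in C. This is a toy under stated assumptions, not the kernel code: `struct toy_disk`, `toy_rescan()` and `toy_detach_unfixed()` are hypothetical names invented for illustration, and the only behavior taken from the description above is that the rescan returns -EBUSY while a partition is open and that the detach path does not propagate that error.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy state; field names are illustrative, not the kernel's. */
struct toy_disk {
    int open_partitions;    /* partitions currently held open */
    bool bound;             /* backing file attached */
    bool stale_partitions;  /* partition devices left over after detach */
};

/* The rescan refuses to run while any partition is open (step 5). */
int toy_rescan(struct toy_disk *d)
{
    if (d->open_partitions)
        return -EBUSY;
    d->stale_partitions = false;  /* old partitions dropped */
    return 0;
}

/* Detach without the fix: tears down at once, ignores the rescan result. */
int toy_detach_unfixed(struct toy_disk *d)
{
    int rc;

    d->bound = false;
    d->stale_partitions = true;   /* partitions linger until a rescan succeeds */
    rc = toy_rescan(d);
    (void)rc;                     /* error is at most logged, never returned (step 7) */
    return 0;
}
```

With one partition open at detach time, the model reports success while `stale_partitions` stays true, which corresponds to the stale partitions described in step 8.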
>
> Fix:
> Defer the detach of the loop device to the release function, which is
> called when the last close happens, by setting lo_flags to
> LO_FLAGS_AUTOCLEAR at the time of detach, i.e., in the loop_clr_fd()
> function.
>
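
The deferred-detach idea can be illustrated with a minimal userspace sketch in C. It is a toy model, not the patch itself: `struct toy_loop`, `TOY_AUTOCLEAR` and the `toy_*` functions are hypothetical names, and the -ENXIO return for an unbound device is an assumption of this sketch. The only behavior taken from the description is that the detach request sets an autoclear flag and the actual teardown happens on the last close.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag value, used by this toy model only. */
#define TOY_AUTOCLEAR 0x1u

struct toy_loop {
    int openers;         /* open handles on the disk or its partitions */
    unsigned int flags;
    bool bound;          /* true while a backing file is attached */
};

static void toy_open(struct toy_loop *lo)
{
    lo->openers++;
}

/* Detach request: mark the device and return; no teardown happens here. */
static int toy_clr_fd(struct toy_loop *lo)
{
    if (!lo->bound)
        return -ENXIO;   /* assumption of this sketch */
    lo->flags |= TOY_AUTOCLEAR;
    return 0;
}

/* Close: only the last closer performs the deferred teardown. */
static void toy_release(struct toy_loop *lo)
{
    if (--lo->openers > 0)
        return;
    if (lo->bound && (lo->flags & TOY_AUTOCLEAR)) {
        lo->bound = false;
        lo->flags = 0;
    }
}

/* Replays the racy interleaving; returns true if the device ends up detached. */
static bool toy_race_demo(void)
{
    struct toy_loop lo = { .openers = 0, .flags = 0, .bound = true };
    int rc;

    toy_open(&lo);         /* "losetup -d" opens the device */
    rc = toy_clr_fd(&lo);  /* detach is only recorded */
    assert(rc == 0);
    (void)rc;
    toy_open(&lo);         /* blkid opens a partition concurrently */
    toy_release(&lo);      /* losetup closes: not the last closer */
    assert(lo.bound);      /* still attached, nothing half torn down */
    toy_release(&lo);      /* blkid closes: the last close detaches */
    return !lo.bound;
}
```

Calling `toy_race_demo()` from a `main()` walks through the same ordering as the two test scripts below: the concurrent opener never observes a partially detached device, because teardown cannot start until it has closed its handle.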
> Test case involves the following two scripts:
>
> script1.sh:
>
> while [ 1 ];
> do
> losetup -P -f /home/opt/looptest/test10.img
> blkid /dev/loop0p1
> done
>
> script2.sh:
>
> while [ 1 ];
> do
> losetup -d /dev/loop0
> done
>
> Without the fix, the following I/O errors have been observed:
>
> kernel: __loop_clr_fd: partition scan of loop0 failed (rc=-16)
> kernel: I/O error, dev loop0, sector 20971392 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> kernel: I/O error, dev loop0, sector 108868 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> kernel: Buffer I/O error on dev loop0p1, logical block 27201, async page read
>
> Signed-off-by: Gulam Mohamed <gulam.mohamed@...cle.com>
> ---
>
Looks good.
Reviewed-by: Chaitanya Kulkarni <kch@...dia.com>
-ck
I ran the blktests test related to this patch [1]. Without this patch I can
see the following messages:

[ 320.404176] __loop_clr_fd: partition scan of loop0 failed (rc=-16)
[ 322.908994] __loop_clr_fd: partition scan of loop0 failed (rc=-16)

With this patch applied, these messages are gone when running the same test
posted in [1].

[1] https://lore.kernel.org/all/ymanwmgtn76jg56vmjbg5vxcegfng2ewccgntmtzskwl6qx42d@g3iyvqldgais/T/