linux-kernel - Re: [PATCH V4 for-6.10/block] loop: Fix a race between loop detach and loop open

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <d9a00203-a744-4d9e-b86a-51703e1a135a@linux.dev>
Date: Mon, 31 Mar 2025 16:10:47 +0200
From: Zhu Yanjun <yanjun.zhu@...ux.dev>
To: Gulam Mohamed <gulam.mohamed@...cle.com>, linux-block@...r.kernel.org,
 linux-kernel@...r.kernel.org
Cc: yukuai1@...weicloud.com, hch@....de, axboe@...nel.dk
Subject: Re: [PATCH V4 for-6.10/block] loop: Fix a race between loop detach
 and loop open

On 07.06.24 21:06, Gulam Mohamed wrote:
> 1. Userspace sends the command "losetup -d" which uses the open() call
>     to open the device
> 2. Kernel receives the ioctl command "LOOP_CLR_FD" which calls the
>     function loop_clr_fd()
> 3. If LOOP_CLR_FD is the first command received at the time, then the
>     AUTOCLEAR flag is not set and deletion of the
>     loop device proceeds ahead and scans the partitions (drop/add
>     partitions)
> 
>          if (disk_openers(lo->lo_disk) > 1) {
>                  lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
>                  loop_global_unlock(lo, true);
>                  return 0;
>          }
> 
>   4. Before scanning partitions, it will check to see if any partition of
>      the loop device is currently opened
>   5. If any partition is opened, then it will return EBUSY:
> 
>      if (disk->open_partitions)
>                  return -EBUSY;
>   6. So, after receiving the "LOOP_CLR_FD" command and just before the above
>      check for open_partitions, if any other command
>      (like blkid) opens any partition of the loop device, then the partition
>      scan will not proceed and EBUSY is returned as shown in above code
>   7. But in "__loop_clr_fd()", this EBUSY error is not propagated
>   8. We have noticed that this is causing the partitions of the loop to
>      remain stale even after the loop device is detached resulting in the
>      IO errors on the partitions
> Fix:
> Defer the detach of loop device to release function, which is called
> when the last close happens, by setting the lo_flags to LO_FLAGS_AUTOCLEAR
> at the time of detach i.e in loop_clr_fd() function.
> 
> Test case involves the following two scripts:
> 
> script1.sh:
> 
> while [ 1 ];
> do
>          losetup -P -f /home/opt/looptest/test10.img
>          blkid /dev/loop0p1
> done
> 
> script2.sh:
> 
> while [ 1 ];
> do
>          losetup -d /dev/loop0
> done
> 
> Without fix, the following IO errors have been observed:
> 
> kernel: __loop_clr_fd: partition scan of loop0 failed (rc=-16)
> kernel: I/O error, dev loop0, sector 20971392 op 0x0:(READ) flags 0x80700
>          phys_seg 1 prio class 0
> kernel: I/O error, dev loop0, sector 108868 op 0x0:(READ) flags 0x0
>          phys_seg 1 prio class 0
> kernel: Buffer I/O error on dev loop0p1, logical block 27201, async page
>          read
> 
> Signed-off-by: Gulam Mohamed <gulam.mohamed@...cle.com>

This is for v6.10 stable, should add "Cc: stable@...r.kernel.org"?
So the engineers who work the stable branch can find this commit and 
apply it?

Reviewed-by: Zhu Yanjun <yanjun.zhu@...ux.dev>

Zhu Yanjun

> ---
> v4<-v3:
> 1. Defer the loop detach to last close of loop device
> 2. Removed the use of lo_open due to following reasons:
> 
> Setting the lo_state to Lo_rundown in loop_clr_fd() may not help in
> stopping the incoming open(), when the loop is being detached, as the
> open() could invoke the lo_open() before the lo_state is set to Lo_rundown
> and increment the disk_openers refcnt later.
> As the actual cleanup is deferred to last close, in release, there is no
> chance for the open() to kick in to take the reference. Because both open()
> and release() are protected by open_mutex and hence they cannot run in
> parallel.
> So, lo_open() and setting lo_state to Lo_rundown is not needed. Removing
> the loop state Lo_rundown as its not used anymore.
> 
>   drivers/block/loop.c | 44 ++++++++------------------------------------
>   1 file changed, 8 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 28a95fd366fe..4936cadc1a63 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -41,7 +41,6 @@
>   enum {
>   	Lo_unbound,
>   	Lo_bound,
> -	Lo_rundown,
>   	Lo_deleting,
>   };
>   
> @@ -1131,7 +1130,7 @@ static int loop_configure(struct loop_device *lo, blk_mode_t mode,
>   	return error;
>   }
>   
> -static void __loop_clr_fd(struct loop_device *lo, bool release)
> +static void __loop_clr_fd(struct loop_device *lo)
>   {
>   	struct file *filp;
>   	gfp_t gfp = lo->old_gfp_mask;
> @@ -1139,14 +1138,6 @@ static void __loop_clr_fd(struct loop_device *lo, bool release)
>   	if (test_bit(QUEUE_FLAG_WC, &lo->lo_queue->queue_flags))
>   		blk_queue_write_cache(lo->lo_queue, false, false);
>   
> -	/*
> -	 * Freeze the request queue when unbinding on a live file descriptor and
> -	 * thus an open device.  When called from ->release we are guaranteed
> -	 * that there is no I/O in progress already.
> -	 */
> -	if (!release)
> -		blk_mq_freeze_queue(lo->lo_queue);
> -
>   	spin_lock_irq(&lo->lo_lock);
>   	filp = lo->lo_backing_file;
>   	lo->lo_backing_file = NULL;
> @@ -1164,8 +1155,6 @@ static void __loop_clr_fd(struct loop_device *lo, bool release)
>   	mapping_set_gfp_mask(filp->f_mapping, gfp);
>   	/* This is safe: open() is still holding a reference. */
>   	module_put(THIS_MODULE);
> -	if (!release)
> -		blk_mq_unfreeze_queue(lo->lo_queue);
>   
>   	disk_force_media_change(lo->lo_disk);
>   
> @@ -1180,11 +1169,7 @@ static void __loop_clr_fd(struct loop_device *lo, bool release)
>   		 * must be at least one and it can only become zero when the
>   		 * current holder is released.
>   		 */
> -		if (!release)
> -			mutex_lock(&lo->lo_disk->open_mutex);
>   		err = bdev_disk_changed(lo->lo_disk, false);
> -		if (!release)
> -			mutex_unlock(&lo->lo_disk->open_mutex);
>   		if (err)
>   			pr_warn("%s: partition scan of loop%d failed (rc=%d)\n",
>   				__func__, lo->lo_number, err);
> @@ -1232,25 +1217,8 @@ static int loop_clr_fd(struct loop_device *lo)
>   		loop_global_unlock(lo, true);
>   		return -ENXIO;
>   	}
> -	/*
> -	 * If we've explicitly asked to tear down the loop device,
> -	 * and it has an elevated reference count, set it for auto-teardown when
> -	 * the last reference goes away. This stops $!~#$@ udev from
> -	 * preventing teardown because it decided that it needs to run blkid on
> -	 * the loopback device whenever they appear. xfstests is notorious for
> -	 * failing tests because blkid via udev races with a losetup
> -	 * <dev>/do something like mkfs/losetup -d <dev> causing the losetup -d
> -	 * command to fail with EBUSY.
> -	 */
> -	if (disk_openers(lo->lo_disk) > 1) {
> -		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
> -		loop_global_unlock(lo, true);
> -		return 0;
> -	}
> -	lo->lo_state = Lo_rundown;
> +	lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
>   	loop_global_unlock(lo, true);
> -
> -	__loop_clr_fd(lo, false);
>   	return 0;
>   }
>   
> @@ -1724,15 +1692,19 @@ static void lo_release(struct gendisk *disk)
>   	if (disk_openers(disk) > 0)
>   		return;
>   
> +	/*
> +	 * Clear the backing device information if this is the last close of
> +	 * a device that's been marked for auto clear, or on which LOOP_CLR_FD
> +	 * has been called.
> +	 */
>   	mutex_lock(&lo->lo_mutex);
>   	if (lo->lo_state == Lo_bound && (lo->lo_flags & LO_FLAGS_AUTOCLEAR)) {
> -		lo->lo_state = Lo_rundown;
>   		mutex_unlock(&lo->lo_mutex);
>   		/*
>   		 * In autoclear mode, stop the loop thread
>   		 * and remove configuration after last close.
>   		 */
> -		__loop_clr_fd(lo, true);
> +		__loop_clr_fd(lo);
>   		return;
>   	}
>   	mutex_unlock(&lo->lo_mutex);