linux-kernel - Re: [PATCH V2 for-6.10/block 1/2] loop: Fix a race between loop detach and loop open

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <c7eb562c-97ae-455a-3859-0ed28ebdf7ae@huaweicloud.com>
Date: Wed, 22 May 2024 10:31:01 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Jens Axboe <axboe@...nel.dk>, Gulam Mohamed <gulam.mohamed@...cle.com>,
 linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Cc: shinichiro.kawasaki@....com, chaitanyak@...dia.com, hch@....de,
 "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH V2 for-6.10/block 1/2] loop: Fix a race between loop
 detach and loop open

Hi,

在 2024/05/22 9:39, Jens Axboe 写道:
> On 5/21/24 4:42 PM, Gulam Mohamed wrote:
>> Description
>> ===========
>>
>> 1. Userspace sends the command "losetup -d" which uses the open() call
>>     to open the device
>> 2. Kernel receives the ioctl command "LOOP_CLR_FD" which calls the
>>     function loop_clr_fd()
>> 3. If LOOP_CLR_FD is the first command received at the time, then the
>>     AUTOCLEAR flag is not set and deletion of the
>>     loop device proceeds ahead and scans the partitions (drop/add
>>     partitions)
>>
>> 	if (disk_openers(lo->lo_disk) > 1) {
>> 		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
>> 		loop_global_unlock(lo, true);
>> 		return 0;
>> 	}
>>
>>   4. Before scanning partitions, it will check to see if any partition of
>>      the loop device is currently opened
>>   5. If any partition is opened, then it will return EBUSY:
>>
>>      if (disk->open_partitions)
>> 		return -EBUSY;
>>   6. So, after receiving the "LOOP_CLR_FD" command and just before the above
>>      check for open_partitions, if any other command
>>      (like blkid) opens any partition of the loop device, then the partition
>>      scan will not proceed and EBUSY is returned as shown in above code
>>   7. But in "__loop_clr_fd()", this EBUSY error is not propagated
>>   8. We have noticed that this is causing the partitions of the loop to
>>      remain stale even after the loop device is detached resulting in the
>>      IO errors on the partitions
>>
>> Fix
>> ---
>> Re-introduce the lo_open() call to restrict any process to open the loop
>> device when its being detached
>>
>> Test case
>> =========
>> Test case involves the following two scripts:
>>
>> script1.sh
>> ----------
>> while [ 1 ];
>> do
>> 	losetup -P -f /home/opt/looptest/test10.img
>> 	blkid /dev/loop0p1
>> done
>>
>> script2.sh
>> ----------
>> while [ 1 ];
>> do
>> 	losetup -d /dev/loop0
>> done
>>
>> Without fix, the following IO errors have been observed:
>>
>> kernel: __loop_clr_fd: partition scan of loop0 failed (rc=-16)
>> kernel: I/O error, dev loop0, sector 20971392 op 0x0:(READ) flags 0x80700
>>          phys_seg 1 prio class 0
>> kernel: I/O error, dev loop0, sector 108868 op 0x0:(READ) flags 0x0
>>          phys_seg 1 prio class 0
>> kernel: Buffer I/O error on dev loop0p1, logical block 27201, async page
>>          read
>>
>> V1->V2:
>> 	Added a test case, 010, in blktests in tests/loop/
>> Signed-off-by: Gulam Mohamed <gulam.mohamed@...cle.com>
>> ---
>>   drivers/block/loop.c | 19 +++++++++++++++++++
>>   1 file changed, 19 insertions(+)
>>
>> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
>> index 28a95fd366fe..9a235d8c062d 100644
>> --- a/drivers/block/loop.c
>> +++ b/drivers/block/loop.c
>> @@ -1717,6 +1717,24 @@ static int lo_compat_ioctl(struct block_device *bdev, blk_mode_t mode,
>>   }
>>   #endif
>>   
>> +static int lo_open(struct gendisk *disk, blk_mode_t mode)
>> +{
>> +        struct loop_device *lo = disk->private_data;
>> +        int err;
>> +
>> +        if (!lo)
>> +                return -ENXIO;
>> +
>> +        err = mutex_lock_killable(&lo->lo_mutex);
>> +        if (err)
>> +                return err;
>> +
>> +        if (lo->lo_state == Lo_rundown)
>> +                err = -ENXIO;
>> +        mutex_unlock(&lo->lo_mutex);

This doesn't fix the problem completely, there is still a race window.

lo_release
  if (disk_openers(disk) > 0)
   reutrn
   -> no openers now
		lo_open
		 if (lo->lo_state == Lo_rundown)
		 -> no set yet
		 open succeed
  mutex_lock(lo_mutex)
  lo->lo_state = Lo_rundown
  mutex_unlock(lo_mutex)
  __loop_clr_fd

And with the respect, loop_clr_fd() has the same problem.

I think probably loop need a open counter for itself.

Thanks,
Kuai

>> +	return err;
>> +}
> 
> Most of this function uses spaces rather than tabs.
>