Message-Id: <20210511181558.380764-1-gulam.mohamed@oracle.com>
Date:   Tue, 11 May 2021 18:15:58 +0000
From:   Gulam Mohamed <gulam.mohamed@...cle.com>
To:     viro@...iv.linux.org.uk, axboe@...nel.dk,
        linux-fsdevel@...r.kernel.org, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, hch@....de,
        martin.petersen@...cle.com
Cc:     junxiao.bi@...cle.com, gulam.mohamed@...cle.com
Subject: [PATCH V1 1/1] Fix race between iscsi logout and systemd-udevd

Problem description:

While patching the kernel, a customer was switching between iscsi
disks. To switch, a script logged out of the currently connected
iscsi disk and then logged in to the new one. The script also ran the
"parted" command to list the partition details just before the iscsi
logout. This use of "parted" left stale links to the disks in
/sys/class/block.

Analysis:

As part of iscsi logout, the partitions and the disk are removed in
del_gendisk(), which runs from a kworker. The parted command, used to
list the partitions, opens the disk in RW mode, which results in
systemd-udevd re-reading the partitions through the BLKRRPART ioctl.
The rescan deletes and re-adds the partitions. So both the iscsi
logout processing (through the kworker) and the "parted" command
(through systemd-udevd) add and delete partitions. In our case, the
following sequence of operations happened (the iscsi device is
/dev/sdb with partition sdb1):

1. sdb1 was removed by parted
2. kworker, as part of iscsi logout, couldn't remove sdb1 as it was
   already removed by parted
3. sdb1 was added by parted
4. sdb was now removed as part of iscsi logout (the last part of the
   device removal, after removing the partitions)
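
For reference, the partition re-read that systemd-udevd performs in the
sequence above comes down to a BLKRRPART ioctl on the whole-disk node.
A minimal userspace sketch, assuming a Linux system that provides
<linux/fs.h>; the function name reread_partitions() is ours for
illustration and is not taken from udev or parted:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>	/* BLKRRPART */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Ask the kernel to re-read the partition table of a whole-disk node
 * such as /dev/sdb (hypothetical helper, for illustration only).
 * Returns 0 on success or a negative errno value on failure.
 */
int reread_partitions(const char *disk_path)
{
	int fd, ret = 0;

	assert(disk_path != NULL);

	fd = open(disk_path, O_RDONLY);
	if (fd < 0)
		return -errno;

	/* BLKRRPART takes no argument; it needs CAP_SYS_ADMIN. */
	if (ioctl(fd, BLKRRPART) < 0)
		ret = -errno;

	close(fd);
	return ret;
}
```

Calling reread_partitions("/dev/sdb") as root while the iscsi logout is
in flight should exercise the same drop/add path that races with
del_gendisk().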

Since the symlink /sys/class/block/sdb1 points to
/sys/devices/platform/hostx/sessionx/targetx:x/block/sdb/sdb1
and sdb has already been removed, the symlink /sys/class/block/sdb1
is left orphaned and stale. This stale link is the result of a race
in the kernel between systemd-udevd and the iscsi logout processing,
as described above. We were able to reproduce this even with the
latest upstream kernel.
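
A stale entry of this kind can be spotted from userspace: lstat() still
sees the symlink, while stat(), which follows it, fails with ENOENT. A
minimal sketch; the helper names is_stale_link() and
print_link_target() are hypothetical and not part of any existing tool:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 1 if path is a symlink whose target no longer exists,
 * 0 if path is not a symlink or still resolves, -1 on any other error.
 */
int is_stale_link(const char *path)
{
	struct stat st;

	assert(path != NULL);

	if (lstat(path, &st) != 0)	/* does not follow the link */
		return -1;
	if (!S_ISLNK(st.st_mode))
		return 0;
	if (stat(path, &st) == 0)	/* follows the link */
		return 0;
	return (errno == ENOENT || errno == ENOTDIR) ? 1 : -1;
}

/* Print "path -> target" for a symlink; returns 0 on success, -1 on error. */
int print_link_target(const char *path)
{
	char target[4096];
	ssize_t n = readlink(path, target, sizeof(target) - 1);

	if (n < 0)
		return -1;
	target[n] = '\0';	/* readlink() does not NUL-terminate */
	printf("%s -> %s\n", path, target);
	return 0;
}
```

On an affected system, is_stale_link("/sys/class/block/sdb1") would be
expected to return 1 after the logout completes.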

Fix:

While dropping/adding partitions as part of the BLKRRPART ioctl, take
the read lock on "bdev_lookup_sem" to synchronize with del_gendisk().

Signed-off-by: Gulam Mohamed <gulam.mohamed@...cle.com>
---
 fs/block_dev.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 09d6f7229db9..e903a7edfd63 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1245,9 +1245,17 @@ int bdev_disk_changed(struct block_device *bdev, bool invalidate)
 	lockdep_assert_held(&bdev->bd_mutex);
 
 rescan:
+	down_read(&bdev_lookup_sem);
+	if (!(disk->flags & GENHD_FL_UP)) {
+		up_read(&bdev_lookup_sem);
+		return -ENXIO;
+	}
+
 	ret = blk_drop_partitions(bdev);
-	if (ret)
+	if (ret) {
+		up_read(&bdev_lookup_sem);
 		return ret;
+	}
 
 	clear_bit(GD_NEED_PART_SCAN, &disk->state);
 
@@ -1270,8 +1278,10 @@ int bdev_disk_changed(struct block_device *bdev, bool invalidate)
 
 	if (get_capacity(disk)) {
 		ret = blk_add_partitions(disk, bdev);
-		if (ret == -EAGAIN)
+		if (ret == -EAGAIN) {
+			up_read(&bdev_lookup_sem);
 			goto rescan;
+		}
 	} else if (invalidate) {
 		/*
 		 * Tell userspace that the media / partition table may have
@@ -1280,6 +1290,7 @@ int bdev_disk_changed(struct block_device *bdev, bool invalidate)
 		kobject_uevent(&disk_to_dev(disk)->kobj, KOBJ_CHANGE);
 	}
 
+	up_read(&bdev_lookup_sem);
 	return ret;
 }
 /*
-- 
2.27.0
