Message-ID: <20150504022816.GB14452@blaptop>
Date:	Mon, 4 May 2015 11:28:17 +0900
From:	Minchan Kim <minchan@...nel.org>
To:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Nitin Gupta <ngupta@...are.org>, linux-kernel@...r.kernel.org,
	Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Subject: Re: [PATCHv3 9/9] zram: add dynamic device add/remove functionality

On Mon, May 04, 2015 at 11:20:08AM +0900, Minchan Kim wrote:
> Hello Sergey,
> 
> On Thu, Apr 30, 2015 at 03:51:12PM +0900, Sergey Senozhatsky wrote:
> > On (04/30/15 15:44), Minchan Kim wrote:
> > > > > I think the cause of the deadlock is that you are trying to remove a sysfs file
> > > > > from within a sysfs handler.
> > > > > 
> > > > > #> echo 1 > /sys/xxx/zram_remove
> > > > > 
> > > > > kernfs_fop_write - hold s_active
> > > > >   -> zram_remove_store
> > > > >     -> zram_remove
> > > > >       -> sysfs_remove_group - hold s_active *again*
> > > > > 
> > > > > Right?
> > > > > 
> > > > 
> > > > are those the same s_active locks?
> > > > 
> > > > 
> > > > we hold (s_active#163) and (&bdev->bd_mutex) and want to acquire (s_active#162)
> > > 
> > > Thanks for sharing the message.
> > > You're right. It's a different lock, so that shouldn't be the cause.
> > > Okay, I will review it. Please give me time.
> > > 
> > 
> > sure, no problem and no rush. thanks!
> 
> I've had some time to think it over.
> 
> I think your patch is rather tricky: someone who has already opened /dev/zram
> cannot see the sysfs nodes at first, but after a while they become visible.
> It's weird.
> 
> I want to fix it in a more generic way. Otherwise, we might run into locking
> problems again at some point. We have already experienced that with init_lock,
> although we finally fixed it.
> 
> I think we can fix it with the patch below; I hope it is a more general and
> correct approach. It's based on your [zram: return zram device_id from zram_add()].
> 
> What do you think?
> 
> From e943df5407b880f9262ef959b270226fdc81bc9f Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan@...nel.org>
> Date: Mon, 4 May 2015 08:36:07 +0900
> Subject: [PATCH 1/2] zram: close race by open overriding
> 
> [1] introduced bdev->bd_mutex to protect against a race between mount
> and reset. At that time, we didn't have the dynamic zram add/remove
> feature, so it was okay.
> 
> However, once we introduce the dynamic device feature, bd_mutex
> becomes a problem.
> 
> 	CPU 0
> 
> echo 1 > /sys/block/zram<id>/reset
>   -> kernfs->s_active(A)
>     -> zram:reset_store->bd_mutex(B)
> 
> 	CPU 1
> 
> echo <id> > /sys/class/zram-control/zram_remove
>   ->zram:zram_remove: bd_mutex(B)
>   -> sysfs_remove_group
>     -> kernfs->s_active(A)
> 
> IOW, AB -> BA deadlock
> 
> The reason we hold bd_mutex in zram_remove is to prevent any
> incoming open of /dev/zram[0-9]. Otherwise, we could remove a zram
> device that others have already opened. But it causes the deadlock above.
> 
> To fix the problem, this patch overrides the block device's open and
> returns -EBUSY while zram claims the device for reset, so any incoming
> open will fail and we no longer need to hold bd_mutex for
> zram_remove.
> 
> This patch prepares for the zram add/remove feature.
> 
> [1] ba6b17: zram: fix umount-reset_store-mount race condition
> Signed-off-by: Minchan Kim <minchan@...nel.org>

If the above has no problems, we could apply your last patch on top of it.
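
For reference, since the PATCH 1/2 diff itself is not quoted in this mail,
the open override it describes would look roughly like the sketch below.
Only zram->claim and the -EBUSY behaviour come from the description above;
the function and struct names (zram_open(), zram_devops) are illustrative:

	static int zram_open(struct block_device *bdev, fmode_t mode)
	{
		int ret = 0;
		struct zram *zram;

		/* ->open is called with bd_mutex held */
		WARN_ON(!mutex_is_locked(&bdev->bd_mutex));

		zram = bdev->bd_disk->private_data;
		/* zram was claimed for reset/remove, refuse any new open */
		if (zram->claim)
			ret = -EBUSY;

		return ret;
	}

	static const struct block_device_operations zram_devops = {
		.open = zram_open,
		.swap_slot_free_notify = zram_slot_free_notify,
		.owner = THIS_MODULE
	};

With that in place, zram_remove() in the patch below only needs bd_mutex
for the short bd_openers/claim check before setting zram->claim.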

>From 5bfa8a2e312a9c8493f574b1cf513ef4693a465c Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Date: Mon, 4 May 2015 09:02:23 +0900
Subject: [PATCH 2/2] zram: add dynamic device add/remove functionality

We currently don't support on-demand device creation. The one and only way
to have N zram devices is to specify the num_devices module parameter
(default value: 1). IOW, if at some point a user wants to have N + 1
devices, he/she must umount all the existing devices, unload the module,
and load the module again passing num_devices equal to N + 1. And do this
again, if needed.

This patch introduces zram control sysfs class, which has two sysfs
attrs:
- zram_add      -- add a new zram device
- zram_remove   -- remove a specific (device_id) zram device

The zram_add sysfs attr is read-only and supports only automatic device id
assignment (as requested by Minchan Kim). A read operation performed on
this attr creates a new zram device and returns either its device_id or an
error status.

Usage example:
	# add a new zram device (device id is assigned automatically)
	cat /sys/class/zram-control/zram_add
	2

	# remove a specific zram device
	echo 4 > /sys/class/zram-control/zram_remove

Returning the zram_add() error code back to the user (-ENOMEM in this case):

	cat /sys/class/zram-control/zram_add
	cat: /sys/class/zram-control/zram_add: Cannot allocate memory

NOTE: there might be users who already depend on the fact that at least the
zram0 device always gets created by zram_init(). Preserve this behavior.

[minchan]: use zram->claim to avoid lockdep splat
Reported-by: Minchan Kim <minchan@...nel.org>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
---
 Documentation/blockdev/zram.txt | 23 ++++++++--
 drivers/block/zram/zram_drv.c   | 97 +++++++++++++++++++++++++++++++++++++++--
 2 files changed, 114 insertions(+), 6 deletions(-)

diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index 65e9430..fc686d4 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -99,7 +99,24 @@ size of the disk when not in use so a huge zram is wasteful.
 	mkfs.ext4 /dev/zram1
 	mount /dev/zram1 /tmp
 
-7) Stats:
+7) Add/remove zram devices
+
+zram provides a control interface, which enables dynamic (on-demand) device
+addition and removal.
+
+In order to add a new /dev/zramX device, perform a read operation on the
+zram_add attribute. This will return either the new device's device id
+(meaning that you can use /dev/zram<id>) or an error code.
+
+Example:
+	cat /sys/class/zram-control/zram_add
+	1
+
+To remove an existing /dev/zramX device (where X is a device id),
+execute:
+	echo X > /sys/class/zram-control/zram_remove
+
+8) Stats:
 Per-device statistics are exported as various nodes under /sys/block/zram<id>/
 
 A brief description of exported device attritbutes. For more details please
@@ -174,11 +191,11 @@ line of text and contains the following stats separated by whitespace:
 	zero_pages
 	num_migrated
 
-8) Deactivate:
+9) Deactivate:
 	swapoff /dev/zram0
 	umount /dev/zram1
 
-9) Reset:
+10) Reset:
 	Write any positive value to 'reset' sysfs node
 	echo 1 > /sys/block/zram0/reset
 	echo 1 > /sys/block/zram1/reset
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 7fb72dc..97cd4f3 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -29,10 +29,14 @@
 #include <linux/vmalloc.h>
 #include <linux/err.h>
 #include <linux/idr.h>
+#include <linux/sysfs.h>
 
 #include "zram_drv.h"
 
 static DEFINE_IDR(zram_index_idr);
+/* idr index must be protected */
+static DEFINE_MUTEX(zram_index_mutex);
+
 static int zram_major;
 static const char *default_compressor = "lzo";
 
@@ -1278,24 +1282,101 @@ out_free_dev:
 	return ret;
 }
 
-static void zram_remove(struct zram *zram)
+static int zram_remove(struct zram *zram)
 {
-	pr_info("Removed device: %s\n", zram->disk->disk_name);
+	struct block_device *bdev;
+
+	bdev = bdget_disk(zram->disk, 0);
+	if (!bdev)
+		return -ENOMEM;
+
+	mutex_lock(&bdev->bd_mutex);
+	if (bdev->bd_openers || zram->claim) {
+		mutex_unlock(&bdev->bd_mutex);
+		return -EBUSY;
+	}
+
+	zram->claim = true;
+	mutex_unlock(&bdev->bd_mutex);
+
 	/*
 	 * Remove sysfs first, so no one will perform a disksize
-	 * store while we destroy the devices
+	 * store while we destroy the devices. This also helps during
+	 * zram_remove() -- zram_reset_device() is the last holder of
+	 * ->init_lock.
 	 */
 	sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
 			&zram_disk_attr_group);
 
+	/* Make sure all pending I/O is finished */
+	fsync_bdev(bdev);
 	zram_reset_device(zram);
+	bdput(bdev);
+
+	pr_info("Removed device: %s\n", zram->disk->disk_name);
+
 	idr_remove(&zram_index_idr, zram->disk->first_minor);
 	blk_cleanup_queue(zram->disk->queue);
 	del_gendisk(zram->disk);
 	put_disk(zram->disk);
 	kfree(zram);
+
+	return 0;
 }
 
+/* zram module control sysfs attributes */
+static ssize_t zram_add_show(struct class *class,
+			struct class_attribute *attr,
+			char *buf)
+{
+	int ret;
+
+	mutex_lock(&zram_index_mutex);
+	ret = zram_add();
+	mutex_unlock(&zram_index_mutex);
+
+	if (ret < 0)
+		return ret;
+	return scnprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t zram_remove_store(struct class *class,
+			struct class_attribute *attr,
+			const char *buf,
+			size_t count)
+{
+	struct zram *zram;
+	int ret, dev_id;
+
+	/* dev_id is gendisk->first_minor, which is `int' */
+	ret = kstrtoint(buf, 10, &dev_id);
+	if (ret || dev_id < 0)
+		return -EINVAL;
+
+	mutex_lock(&zram_index_mutex);
+
+	zram = idr_find(&zram_index_idr, dev_id);
+	if (zram)
+		ret = zram_remove(zram);
+	else
+		ret = -ENODEV;
+
+	mutex_unlock(&zram_index_mutex);
+	return ret ? ret : count;
+}
+
+static struct class_attribute zram_control_class_attrs[] = {
+	__ATTR_RO(zram_add),
+	__ATTR_WO(zram_remove),
+	__ATTR_NULL,
+};
+
+static struct class zram_control_class = {
+	.name		= "zram-control",
+	.owner		= THIS_MODULE,
+	.class_attrs	= zram_control_class_attrs,
+};
+
 static int zram_remove_cb(int id, void *ptr, void *data)
 {
 	zram_remove(ptr);
@@ -1304,6 +1385,7 @@ static int zram_remove_cb(int id, void *ptr, void *data)
 
 static void destroy_devices(void)
 {
+	class_unregister(&zram_control_class);
 	idr_for_each(&zram_index_idr, &zram_remove_cb, NULL);
 	idr_destroy(&zram_index_idr);
 	unregister_blkdev(zram_major, "zram");
@@ -1313,14 +1395,23 @@ static int __init zram_init(void)
 {
 	int ret;
 
+	ret = class_register(&zram_control_class);
+	if (ret) {
+		pr_warn("Unable to register zram-control class\n");
+		return ret;
+	}
+
 	zram_major = register_blkdev(0, "zram");
 	if (zram_major <= 0) {
 		pr_warn("Unable to get major number\n");
+		class_unregister(&zram_control_class);
 		return -EBUSY;
 	}
 
 	while (num_devices != 0) {
+		mutex_lock(&zram_index_mutex);
 		ret = zram_add();
+		mutex_unlock(&zram_index_mutex);
 		if (ret < 0)
 			goto out_error;
 		num_devices--;
-- 
1.9.3

-- 
Kind regards,
Minchan Kim
