Message-ID: <20150504022816.GB14452@blaptop>
Date: Mon, 4 May 2015 11:28:17 +0900
From: Minchan Kim <minchan@...nel.org>
To: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Nitin Gupta <ngupta@...are.org>, linux-kernel@...r.kernel.org,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Subject: Re: [PATCHv3 9/9] zram: add dynamic device add/remove functionality

On Mon, May 04, 2015 at 11:20:08AM +0900, Minchan Kim wrote:
> Hello Sergey,
>
> On Thu, Apr 30, 2015 at 03:51:12PM +0900, Sergey Senozhatsky wrote:
> > On (04/30/15 15:44), Minchan Kim wrote:
> > > > > I think the problem of deadlock is that you are trying to remove sysfs file
> > > > > in sysfs handler.
> > > > >
> > > > > #> echo 1 > /sys/xxx/zram_remove
> > > > >
> > > > > kernfs_fop_write - hold s_active
> > > > > -> zram_remove_store
> > > > > -> zram_remove
> > > > > -> sysfs_remove_group - hold s_active *again*
> > > > >
> > > > > Right?
> > > > >
> > > >
> > > > are those same s_active locks?
> > > >
> > > >
> > > > we hold (s_active#163) and (&bdev->bd_mutex) and want to acquire (s_active#162)
> > >
> > > Thanks for sharing the message.
> > > You're right. It's another lock so it shouldn't be a reason.
> > > Okay, I will review it. Please give me time.
> > >
> >
> > sure, no problem and no rush. thanks!
>
> I had time to think it over.
>
> I think your patch is rather tricky: someone who has already opened
> /dev/zram cannot see the sysfs nodes at first, but can see them after
> a while. That's weird.
>
> I want to fix it in a more generic way. Otherwise, we might run into
> locking problems again at some point. We already experienced that with
> init_lock, although we finally fixed it.
>
> I think we can fix it with the patch below. I hope it's a more general and
> correct approach. It's based on your [zram: return zram device_id from zram_add()]
>
> What do you think?
>
> From e943df5407b880f9262ef959b270226fdc81bc9f Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan@...nel.org>
> Date: Mon, 4 May 2015 08:36:07 +0900
> Subject: [PATCH 1/2] zram: close race by open overriding
>
> [1] introduced bdev->bd_mutex to protect against a race between mount
> and reset. At that time, we didn't have the dynamic zram-add/remove
> feature, so it was okay.
>
> However, as we introduce the dynamic device feature, bd_mutex becomes
> a problem.
>
> CPU 0
>
> echo 1 > /sys/block/zram<id>/reset
> -> kernfs->s_active(A)
> -> zram:reset_store->bd_mutex(B)
>
> CPU 1
>
> echo <id> > /sys/class/zram-control/zram_remove
> ->zram:zram_remove: bd_mutex(B)
> -> sysfs_remove_group
> -> kernfs->s_active(A)
>
> IOW, AB -> BA deadlock
>
> The reason we are holding bd_mutex for zram_remove is to prevent
> any incoming open of /dev/zram[0-9]. Otherwise, we could remove a zram
> device that others have already opened. But it causes the above deadlock.
>
> To fix the problem, this patch overrides block_device.open so that
> it returns -EBUSY once zram has claimed the device for reset. Any
> incoming open then fails, so we don't need to hold bd_mutex
> for zram_remove any more.
>
> This patch is to prepare for zram-add/remove feature.
>
> [1] ba6b17: zram: fix umount-reset_store-mount race condition
> Signed-off-by: Minchan Kim <minchan@...nel.org>
If above has no problem, we could apply your last patch on top of it.
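
For illustration only, the claim idea from [PATCH 1/2] can be sketched in
user space. This is not code from either patch: model_dev, model_dev_init,
model_open, model_release and model_claim are made-up names, and a pthread
mutex stands in for bd_mutex. As far as I understand, in the kernel the block
layer already holds bd_mutex around the open callback; the sketch takes the
lock inside model_open itself to stay self-contained.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few fields that matter here. */
struct model_dev {
	pthread_mutex_t bd_mutex;	/* models bdev->bd_mutex */
	int openers;			/* models bdev->bd_openers */
	bool claim;			/* models zram->claim */
};

static void model_dev_init(struct model_dev *d)
{
	pthread_mutex_init(&d->bd_mutex, NULL);
	d->openers = 0;
	d->claim = false;
}

/* Models the overridden open: refuse new openers once claimed. */
static int model_open(struct model_dev *d)
{
	int ret = 0;

	pthread_mutex_lock(&d->bd_mutex);
	if (d->claim)
		ret = -EBUSY;
	else
		d->openers++;
	pthread_mutex_unlock(&d->bd_mutex);
	return ret;
}

/* Models the last close of the device node. */
static void model_release(struct model_dev *d)
{
	pthread_mutex_lock(&d->bd_mutex);
	if (d->openers > 0)
		d->openers--;
	pthread_mutex_unlock(&d->bd_mutex);
}

/*
 * Models the first step of reset/remove: take the lock only long
 * enough to check for openers and set the claim flag. Teardown
 * (including sysfs removal) would then run with the lock dropped.
 */
static int model_claim(struct model_dev *d)
{
	pthread_mutex_lock(&d->bd_mutex);
	if (d->openers || d->claim) {
		pthread_mutex_unlock(&d->bd_mutex);
		return -EBUSY;
	}
	d->claim = true;
	pthread_mutex_unlock(&d->bd_mutex);
	return 0;
}
```

The point of the sketch is that the lock is never held across teardown, so
the B -> A half of the AB -> BA ordering goes away, while a late open still
gets -EBUSY because it sees the claim flag.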

From 5bfa8a2e312a9c8493f574b1cf513ef4693a465c Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Date: Mon, 4 May 2015 09:02:23 +0900
Subject: [PATCH 2/2] zram: add dynamic device add/remove functionality

We currently don't support on-demand device creation. The one and only way
to have N zram devices is to specify the num_devices module parameter
(default value: 1). IOW if, for some reason, at some point, the user wants
to have N + 1 devices, he/she must umount all the existing devices, unload
the module, and load the module again with num_devices set to N + 1. And do
this again, if needed.

This patch introduces a zram control sysfs class, which has two sysfs
attrs:
- zram_add -- add a new zram device
- zram_remove -- remove a specific (device_id) zram device

The zram_add sysfs attr is read-only and has only an automatic device id
assignment mode (as requested by Minchan Kim). A read operation performed
on this attr creates a new zram device and returns its device_id or an
error status.

Usage example:
	# add a new zram device
	cat /sys/class/zram-control/zram_add
	2

	# remove a specific zram device
	echo 4 > /sys/class/zram-control/zram_remove

Returning the zram_add() error code back to the user (-ENOMEM in this case):

	cat /sys/class/zram-control/zram_add
	cat: /sys/class/zram-control/zram_add: Cannot allocate memory

NOTE, there might be users who already depend on the fact that at least
the zram0 device always gets created by zram_init(). Preserve this behavior.

[minchan]: use zram->claim to avoid lockdep splat
Reported-by: Minchan Kim <minchan@...nel.org>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
---
Documentation/blockdev/zram.txt | 23 ++++++++--
drivers/block/zram/zram_drv.c | 97 +++++++++++++++++++++++++++++++++++++++--
2 files changed, 114 insertions(+), 6 deletions(-)
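
As a note on the input handling in zram_remove_store() below: the device id
is parsed with kstrtoint() and any parse failure or negative value is
reported as -EINVAL. A rough user-space model of that check, for
illustration only, could look like the following. model_parse_dev_id is a
made-up name and strtol() is only an approximation of kstrtoint(); the
sketch assumes the kernel helper's behavior of accepting a single trailing
newline and rejecting leading whitespace.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical user-space model of the dev_id check: returns 0 and
 * stores the id on success, -EINVAL for anything that is not a
 * non-negative base-10 int (optionally followed by one newline).
 */
static int model_parse_dev_id(const char *buf, int *dev_id)
{
	char *end;
	long val;

	/* strtol() would skip leading whitespace; reject it up front. */
	if (buf[0] == '\0' || isspace((unsigned char)buf[0]))
		return -EINVAL;

	errno = 0;
	val = strtol(buf, &end, 10);
	if (errno || end == buf)
		return -EINVAL;

	/* "echo 4" writes "4\n", so tolerate exactly one newline. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	/* dev_id models gendisk->first_minor, an int that is >= 0. */
	if (val < 0 || val > INT_MAX)
		return -EINVAL;

	*dev_id = (int)val;
	return 0;
}
```

With this model, "4\n" yields id 4, while "-1", "4x" and an empty write are
all rejected before any device lookup would happen.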
diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index 65e9430..fc686d4 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -99,7 +99,24 @@ size of the disk when not in use so a huge zram is wasteful.
mkfs.ext4 /dev/zram1
mount /dev/zram1 /tmp
-7) Stats:
+7) Add/remove zram devices
+
+zram provides a control interface, which enables dynamic (on-demand) device
+addition and removal.
+
+In order to add a new /dev/zramX device, perform read operation on zram_add
+attribute. This will return either new device's device id (meaning that you
+can use /dev/zram<id>) or error code.
+
+Example:
+ cat /sys/class/zram-control/zram_add
+ 1
+
+To remove the existing /dev/zramX device (where X is a device id)
+execute
+ echo X > /sys/class/zram-control/zram_remove
+
+8) Stats:
Per-device statistics are exported as various nodes under /sys/block/zram<id>/
A brief description of exported device attritbutes. For more details please
@@ -174,11 +191,11 @@ line of text and contains the following stats separated by whitespace:
zero_pages
num_migrated
-8) Deactivate:
+9) Deactivate:
swapoff /dev/zram0
umount /dev/zram1
-9) Reset:
+10) Reset:
Write any positive value to 'reset' sysfs node
echo 1 > /sys/block/zram0/reset
echo 1 > /sys/block/zram1/reset
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 7fb72dc..97cd4f3 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -29,10 +29,14 @@
#include <linux/vmalloc.h>
#include <linux/err.h>
#include <linux/idr.h>
+#include <linux/sysfs.h>
#include "zram_drv.h"
static DEFINE_IDR(zram_index_idr);
+/* idr index must be protected */
+static DEFINE_MUTEX(zram_index_mutex);
+
static int zram_major;
static const char *default_compressor = "lzo";
@@ -1278,24 +1282,101 @@ out_free_dev:
return ret;
}
-static void zram_remove(struct zram *zram)
+static int zram_remove(struct zram *zram)
{
- pr_info("Removed device: %s\n", zram->disk->disk_name);
+ struct block_device *bdev;
+
+ bdev = bdget_disk(zram->disk, 0);
+ if (!bdev)
+ return -ENOMEM;
+
+ mutex_lock(&bdev->bd_mutex);
+ if (bdev->bd_openers || zram->claim) {
+ mutex_unlock(&bdev->bd_mutex);
+ return -EBUSY;
+ }
+
+ zram->claim = true;
+ mutex_unlock(&bdev->bd_mutex);
+
/*
* Remove sysfs first, so no one will perform a disksize
- * store while we destroy the devices
+ * store while we destroy the devices. This also helps during
+ * zram_remove() -- device_reset() is the last holder of
+ * ->init_lock.
*/
sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
&zram_disk_attr_group);
+ /* Make sure all pending I/O is finished */
+ fsync_bdev(bdev);
zram_reset_device(zram);
+ bdput(bdev);
+
+ pr_info("Removed device: %s\n", zram->disk->disk_name);
+
idr_remove(&zram_index_idr, zram->disk->first_minor);
blk_cleanup_queue(zram->disk->queue);
del_gendisk(zram->disk);
put_disk(zram->disk);
kfree(zram);
+
+ return 0;
}
+/* zram module control sysfs attributes */
+static ssize_t zram_add_show(struct class *class,
+ struct class_attribute *attr,
+ char *buf)
+{
+ int ret;
+
+ mutex_lock(&zram_index_mutex);
+ ret = zram_add();
+ mutex_unlock(&zram_index_mutex);
+
+ if (ret < 0)
+ return ret;
+ return scnprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t zram_remove_store(struct class *class,
+ struct class_attribute *attr,
+ const char *buf,
+ size_t count)
+{
+ struct zram *zram;
+ int ret, dev_id;
+
+ /* dev_id is gendisk->first_minor, which is `int' */
+ ret = kstrtoint(buf, 10, &dev_id);
+ if (ret || dev_id < 0)
+ return -EINVAL;
+
+ mutex_lock(&zram_index_mutex);
+
+ zram = idr_find(&zram_index_idr, dev_id);
+ if (zram)
+ ret = zram_remove(zram);
+ else
+ ret = -ENODEV;
+
+ mutex_unlock(&zram_index_mutex);
+ return ret ? ret : count;
+}
+
+static struct class_attribute zram_control_class_attrs[] = {
+ __ATTR_RO(zram_add),
+ __ATTR_WO(zram_remove),
+ __ATTR_NULL,
+};
+
+static struct class zram_control_class = {
+ .name = "zram-control",
+ .owner = THIS_MODULE,
+ .class_attrs = zram_control_class_attrs,
+};
+
static int zram_remove_cb(int id, void *ptr, void *data)
{
zram_remove(ptr);
@@ -1304,6 +1385,7 @@ static int zram_remove_cb(int id, void *ptr, void *data)
static void destroy_devices(void)
{
+ class_unregister(&zram_control_class);
idr_for_each(&zram_index_idr, &zram_remove_cb, NULL);
idr_destroy(&zram_index_idr);
unregister_blkdev(zram_major, "zram");
@@ -1313,14 +1395,23 @@ static int __init zram_init(void)
{
int ret;
+ ret = class_register(&zram_control_class);
+ if (ret) {
+ pr_warn("Unable to register zram-control class\n");
+ return ret;
+ }
+
zram_major = register_blkdev(0, "zram");
if (zram_major <= 0) {
pr_warn("Unable to get major number\n");
+ class_unregister(&zram_control_class);
return -EBUSY;
}
while (num_devices != 0) {
+ mutex_lock(&zram_index_mutex);
ret = zram_add();
+ mutex_unlock(&zram_index_mutex);
if (ret < 0)
goto out_error;
num_devices--;
--
1.9.3
--
Kind regards,
Minchan Kim