Message-ID: <Pine.LNX.4.64.1012011117460.8183@cobra.newdream.net>
Date:	Wed, 1 Dec 2010 11:25:16 -0800 (PST)
From:	Sage Weil <sage@...dream.net>
To:	Greg KH <greg@...ah.com>
cc:	Yehuda Sadeh <yehuda@...newdream.net>, ceph-devel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] rbd: replace the rbd sysfs interface

Hi Greg,

I'm sure you're busy and as tired of this thread as we are, but I think
it's close and we have (I hope) just one remaining question.  The current
patch (see below) gives us

 /sys/bus/rbd/{add,remove}
 /sys/bus/rbd/devices/<devid>/                 <-- struct device
 /sys/bus/rbd/devices/<devid>/{some dev attrs}
 /sys/bus/rbd/devices/<devid>/snap_<snapid>/   <-- struct device
 /sys/bus/rbd/devices/<devid>/snap_<snapid>/{some snap attrs}

This works, and is (I hope) using struct device properly.  The only 
problem, purely from a user interface standpoint, is that the snaps are 
mixed in with attributes, so anybody wanting to iterate over snaps needs 
to do something crufty like

 $ for snap in `ls /sys/bus/rbd/devices/$id | grep ^snap_ | cut -c 6-`; do ...

Adding an intermediate snaps/ subdir would let them instead do

 $ for snap in `ls /sys/bus/rbd/devices/$id/snaps/`; do ...

without worrying about the (arbitrarily named) snaps colliding with 
device attributes.  Assuming that is a preferable interface, is the 
"right" way to do that to make "snaps" a struct device?  Or is there a 
good reason why that is not preferable?
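
To make the question concrete, here's roughly what I have in mind (just a
sketch, not something in the patch below; rbd_snaps_dev_type and the
snaps_dev member are made-up names):

 static void rbd_snaps_dev_release(struct device *dev)
 {
 	/* nothing to free; the container would be embedded in rbd_device */
 }

 static struct device_type rbd_snaps_dev_type = {
 	.name		= "rbd_snaps",
 	.release	= rbd_snaps_dev_release,
 };

 static int rbd_register_snaps_dir(struct rbd_device *rbd_dev)
 {
 	struct device *dev = &rbd_dev->snaps_dev;	/* hypothetical member */

 	dev->type = &rbd_snaps_dev_type;
 	dev->parent = &rbd_dev->dev;
 	dev_set_name(dev, "snaps");
 	return device_register(dev);
 }

Each snapshot would then be registered with &rbd_dev->snaps_dev as its
parent instead of &rbd_dev->dev, which is what produces the snaps/
subdirectory above.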

Thanks!
sage



On Tue, 23 Nov 2010, Yehuda Sadeh wrote:

> On Mon, 2010-11-22 at 16:58 -0800, Greg KH wrote:
> > On Mon, Nov 22, 2010 at 04:48:54PM -0800, Yehuda Sadeh Weinraub wrote:
> > > The reason we keep snapshots in a separate subdirectory is that they
> > > can have arbitrary name. So either we prefix them and put them in a
> > > common namespace with the devices, or we put them down the hierarchy.
> > 
> > Do either one.  I would suggest a prefix.
> > 
> > > In any case we don't do any operations on them, we just have them for
> > > informational use and we put them there so that we don't have one big
> > > file that lists them all.
> > 
> > But something cares about them, so treat them properly.
> > 
> > > >> Another way would be to create a group for (2) under (1) and create a
> > > >> kobject for (3), for which you can create group per snapshot.
> > > >>
> > > >> Am I missing something? We already have the first solution (kobjects
> > > >> only) implemented, is there some real benefit for using the third
> > > >> method? We'll have to manually add remove groups anyway, as snapshots
> > > >> can be removed and new snapshots can be added.
> > > >
> > > > Never add kobjects to a struct device, that is showing you that
> > > > something is wrong, and that userspace really will want to get that
> > > > create/destroy event of the sub child.
> > > >
> > > 
> > > But they're there as information device attributes, it's nothing like
> > > partitions in block devices. So we want to just be able to list them
> > > and their attributes easily (and nicely), without having to put them
> > > in one big file.
> > 
> > Then use a prefix and put everything in the same subdirectory underneath
> > the id and you should be fine, right?
> > 
> 
> The following patch puts the snapshots as subdirs under the device directory.
> Each snapshot is a separate device structure whose parent is the mapped
> device. This works, but it is not very pretty or convenient. Can we add
> another intermediate device that'll serve as a container for the snapshots?
> 
> Thanks,
> Yehuda
> 
> --
> 
> Yehuda Sadeh (1):
>   rbd: replace the rbd sysfs interface
> 
>  Documentation/ABI/testing/sysfs-bus-rbd |   84 ++++
>  drivers/block/rbd.c                     |  748 +++++++++++++++++++------------
>  2 files changed, 555 insertions(+), 277 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-bus-rbd
> 
> 
> --
> 
> From: Yehuda Sadeh <yehuda@...newdream.net>
> Date: Fri, 19 Nov 2010 14:51:04 -0800
> Subject: [PATCH 1/1] rbd: replace the rbd sysfs interface
> 
> The new interface creates a directory per mapped image
> and, under each, a subdirectory per available snapshot.
> This keeps the interface cleaner and within the sysfs
> guidelines. The ABI documentation is updated as well.
> 
> Signed-off-by: Yehuda Sadeh <yehuda@...newdream.net>
> ---
>  Documentation/ABI/testing/sysfs-bus-rbd |   83 ++++
>  drivers/block/rbd.c                     |  748 +++++++++++++++++++------------
>  2 files changed, 554 insertions(+), 277 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-rbd b/Documentation/ABI/testing/sysfs-bus-rbd
> new file mode 100644
> index 0000000..90a87e2
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-bus-rbd
> @@ -0,0 +1,83 @@
> +What:		/sys/bus/rbd/
> +Date:		November 2010
> +Contact:	Yehuda Sadeh <yehuda@...newdream.net>,
> +		Sage Weil <sage@...dream.net>
> +Description:
> +
> +Used for adding and removing rbd block devices.
> +
> +Usage: <mon ip addr> <options> <pool name> <rbd image name> [snap name]
> +
> + $ echo "192.168.0.1 name=admin rbd foo" > /sys/bus/rbd/add
> +
> +The snapshot name can be "-" or omitted to map the image read/write. A <dev-id>
> +will be assigned for each registered block device. If a snapshot name is given,
> +the image will be mapped read-only.
> +
> +Removal of a device:
> +
> +  $ echo <dev-id> > /sys/bus/rbd/remove
> +
> +Entries under /sys/bus/rbd/devices/<dev-id>/
> +--------------------------------------------
> +
> +client_id
> +
> +	The ceph unique client id that was assigned for this specific session.
> +
> +major
> +
> +	The block device major number.
> +
> +name
> +
> +	The name of the rbd image.
> +
> +pool
> +
> +	The pool where this rbd image resides. The pool and image name
> +	pair is unique per rados system.
> +
> +size
> +
> +	The size (in bytes) of the mapped block device.
> +
> +refresh
> +
> +	Writing to this file will reread the image header data and update
> +	all relevant data structures accordingly.
> +
> +current_snap
> +
> +	The current snapshot for which the device is mapped.
> +
> +create_snap
> +
> +	Create a snapshot:
> +
> +	 $ echo <snap-name> > /sys/bus/rbd/devices/<dev-id>/create_snap
> +
> +rollback_snap
> +
> +	Rolls back data to the specified snapshot. This goes over the entire
> +	list of rados blocks and sends a rollback command to each.
> +
> +	 $ echo <snap-name> > /sys/bus/rbd/devices/<dev-id>/rollback_snap
> +
> +snap_*
> +
> +	A directory for each snapshot.
> +
> +
> +Entries under /sys/bus/rbd/devices/<dev-id>/snap_<snap-name>
> +-------------------------------------------------------------
> +
> +id
> +
> +	The rados internal snapshot id assigned to this snapshot.
> +
> +size
> +
> +	The size of the image when this snapshot was taken.
> +
> +
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 6ec9d53..008d4a0 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -21,80 +21,9 @@
>  
>  
>  
> -   Instructions for use
> -   --------------------
> +   For usage instructions, please refer to:
>  
> -   1) Map a Linux block device to an existing rbd image.
> -
> -      Usage: <mon ip addr> <options> <pool name> <rbd image name> [snap name]
> -
> -      $ echo "192.168.0.1 name=admin rbd foo" > /sys/class/rbd/add
> -
> -      The snapshot name can be "-" or omitted to map the image read/write.
> -
> -   2) List all active blkdev<->object mappings.
> -
> -      In this example, we have performed step #1 twice, creating two blkdevs,
> -      mapped to two separate rados objects in the rados rbd pool
> -
> -      $ cat /sys/class/rbd/list
> -      #id     major   client_name     pool    name    snap    KB
> -      0       254     client4143      rbd     foo     -      1024000
> -
> -      The columns, in order, are:
> -      - blkdev unique id
> -      - blkdev assigned major
> -      - rados client id
> -      - rados pool name
> -      - rados block device name
> -      - mapped snapshot ("-" if none)
> -      - device size in KB
> -
> -
> -   3) Create a snapshot.
> -
> -      Usage: <blkdev id> <snapname>
> -
> -      $ echo "0 mysnap" > /sys/class/rbd/snap_create
> -
> -
> -   4) Listing a snapshot.
> -
> -      $ cat /sys/class/rbd/snaps_list
> -      #id     snap    KB
> -      0       -       1024000 (*)
> -      0       foo     1024000
> -
> -      The columns, in order, are:
> -      - blkdev unique id
> -      - snapshot name, '-' means none (active read/write version)
> -      - size of device at time of snapshot
> -      - the (*) indicates this is the active version
> -
> -   5) Rollback to snapshot.
> -
> -      Usage: <blkdev id> <snapname>
> -
> -      $ echo "0 mysnap" > /sys/class/rbd/snap_rollback
> -
> -
> -   6) Mapping an image using snapshot.
> -
> -      A snapshot mapping is read-only. This is being done by passing
> -      snap=<snapname> to the options when adding a device.
> -
> -      $ echo "192.168.0.1 name=admin,snap=mysnap rbd foo" > /sys/class/rbd/add
> -
> -
> -   7) Remove an active blkdev<->rbd image mapping.
> -
> -      In this example, we remove the mapping with blkdev unique id 1.
> -
> -      $ echo 1 > /sys/class/rbd/remove
> -
> -
> -   NOTE:  The actual creation and deletion of rados objects is outside the scope
> -   of this driver.
> +                 Documentation/ABI/testing/sysfs-bus-rbd
>  
>   */
>  
> @@ -163,6 +92,14 @@ struct rbd_request {
>  	u64			len;
>  };
>  
> +struct rbd_snap {
> +	struct	device		dev;
> +	const char		*name;
> +	size_t			size;
> +	struct list_head	node;
> +	u64			id;
> +};
> +
>  /*
>   * a single device
>   */
> @@ -193,21 +130,60 @@ struct rbd_device {
>  	int read_only;
>  
>  	struct list_head	node;
> +
> +	/* list of snapshots */
> +	struct list_head	snaps;
> +
> +	/* sysfs related */
> +	struct device		dev;
> +};
> +
> +static struct bus_type rbd_bus_type = {
> +	.name		= "rbd",
>  };
>  
>  static spinlock_t node_lock;      /* protects client get/put */
>  
> -static struct class *class_rbd;	  /* /sys/class/rbd */
>  static DEFINE_MUTEX(ctl_mutex);	  /* Serialize open/close/setup/teardown */
>  static LIST_HEAD(rbd_dev_list);    /* devices */
>  static LIST_HEAD(rbd_client_list);      /* clients */
>  
> +static int __rbd_init_snaps_header(struct rbd_device *rbd_dev);
> +static void rbd_dev_release(struct device *dev);
> +static ssize_t rbd_snap_rollback(struct device *dev,
> +				 struct device_attribute *attr,
> +				 const char *buf,
> +				 size_t size);
> +static ssize_t rbd_snap_add(struct device *dev,
> +			    struct device_attribute *attr,
> +			    const char *buf,
> +			    size_t count);
> +static void __rbd_remove_snap_dev(struct rbd_device *rbd_dev,
> +				  struct rbd_snap *snap);
> +
> +
> +static struct rbd_device *dev_to_rbd(struct device *dev)
> +{
> +	return container_of(dev, struct rbd_device, dev);
> +}
> +
> +static struct device *rbd_get_dev(struct rbd_device *rbd_dev)
> +{
> +	return get_device(&rbd_dev->dev);
> +}
> +
> +static void rbd_put_dev(struct rbd_device *rbd_dev)
> +{
> +	put_device(&rbd_dev->dev);
> +}
>  
>  static int rbd_open(struct block_device *bdev, fmode_t mode)
>  {
>  	struct gendisk *disk = bdev->bd_disk;
>  	struct rbd_device *rbd_dev = disk->private_data;
>  
> +	rbd_get_dev(rbd_dev);
> +
>  	set_device_ro(bdev, rbd_dev->read_only);
>  
>  	if ((mode & FMODE_WRITE) && rbd_dev->read_only)
> @@ -216,9 +192,19 @@ static int rbd_open(struct block_device *bdev, fmode_t mode)
>  	return 0;
>  }
>  
> +static int rbd_release(struct gendisk *disk, fmode_t mode)
> +{
> +	struct rbd_device *rbd_dev = disk->private_data;
> +
> +	rbd_put_dev(rbd_dev);
> +
> +	return 0;
> +}
> +
>  static const struct block_device_operations rbd_bd_ops = {
>  	.owner			= THIS_MODULE,
>  	.open			= rbd_open,
> +	.release		= rbd_release,
>  };
>  
>  /*
> @@ -361,7 +347,6 @@ static int rbd_header_from_disk(struct rbd_image_header *header,
>  	int ret = -ENOMEM;
>  
>  	init_rwsem(&header->snap_rwsem);
> -
>  	header->snap_names_len = le64_to_cpu(ondisk->snap_names_len);
>  	header->snapc = kmalloc(sizeof(struct ceph_snap_context) +
>  				snap_count *
> @@ -1256,10 +1241,20 @@ bad:
>  	return -ERANGE;
>  }
>  
> +static void __rbd_remove_all_snaps(struct rbd_device *rbd_dev)
> +{
> +	struct rbd_snap *snap;
> +
> +	while (!list_empty(&rbd_dev->snaps)) {
> +		snap = list_first_entry(&rbd_dev->snaps, struct rbd_snap, node);
> +		__rbd_remove_snap_dev(rbd_dev, snap);
> +	}
> +}
> +
>  /*
>   * only read the first part of the ondisk header, without the snaps info
>   */
> -static int rbd_update_snaps(struct rbd_device *rbd_dev)
> +static int __rbd_update_snaps(struct rbd_device *rbd_dev)
>  {
>  	int ret;
>  	struct rbd_image_header h;
> @@ -1280,12 +1275,15 @@ static int rbd_update_snaps(struct rbd_device *rbd_dev)
>  	rbd_dev->header.total_snaps = h.total_snaps;
>  	rbd_dev->header.snapc = h.snapc;
>  	rbd_dev->header.snap_names = h.snap_names;
> +	rbd_dev->header.snap_names_len = h.snap_names_len;
>  	rbd_dev->header.snap_sizes = h.snap_sizes;
>  	rbd_dev->header.snapc->seq = snap_seq;
>  
> +	ret = __rbd_init_snaps_header(rbd_dev);
> +
>  	up_write(&rbd_dev->header.snap_rwsem);
>  
> -	return 0;
> +	return ret;
>  }
>  
>  static int rbd_init_disk(struct rbd_device *rbd_dev)
> @@ -1300,6 +1298,11 @@ static int rbd_init_disk(struct rbd_device *rbd_dev)
>  	if (rc)
>  		return rc;
>  
> +	/* no need to lock here, as rbd_dev is not registered yet */
> +	rc = __rbd_init_snaps_header(rbd_dev);
> +	if (rc)
> +		return rc;
> +
>  	rc = rbd_header_set_snap(rbd_dev, rbd_dev->snap_name, &total_size);
>  	if (rc)
>  		return rc;
> @@ -1343,54 +1346,360 @@ out:
>  	return rc;
>  }
>  
> -/********************************************************************
> - * /sys/class/rbd/
> - *                   add	map rados objects to blkdev
> - *                   remove	unmap rados objects
> - *                   list	show mappings
> - *******************************************************************/
> +/*
> +  sysfs
> +*/
> +
> +static ssize_t rbd_size_show(struct device *dev,
> +			     struct device_attribute *attr, char *buf)
> +{
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +
> +	return sprintf(buf, "%llu\n", (unsigned long long)rbd_dev->header.image_size);
> +}
> +
> +static ssize_t rbd_major_show(struct device *dev,
> +			      struct device_attribute *attr, char *buf)
> +{
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
>  
> -static void class_rbd_release(struct class *cls)
> +	return sprintf(buf, "%d\n", rbd_dev->major);
> +}
> +
> +static ssize_t rbd_client_id_show(struct device *dev,
> +				  struct device_attribute *attr, char *buf)
>  {
> -	kfree(cls);
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +
> +	return sprintf(buf, "client%lld\n", ceph_client_id(rbd_dev->client));
>  }
>  
> -static ssize_t class_rbd_list(struct class *c,
> -			      struct class_attribute *attr,
> -			      char *data)
> +static ssize_t rbd_pool_show(struct device *dev,
> +			     struct device_attribute *attr, char *buf)
>  {
> -	int n = 0;
> -	struct list_head *tmp;
> -	int max = PAGE_SIZE;
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +
> +	return sprintf(buf, "%s\n", rbd_dev->pool_name);
> +}
> +
> +static ssize_t rbd_name_show(struct device *dev,
> +			     struct device_attribute *attr, char *buf)
> +{
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +
> +	return sprintf(buf, "%s\n", rbd_dev->obj);
> +}
> +
> +static ssize_t rbd_snap_show(struct device *dev,
> +			     struct device_attribute *attr,
> +			     char *buf)
> +{
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +
> +	return sprintf(buf, "%s\n", rbd_dev->snap_name);
> +}
> +
> +static ssize_t rbd_image_refresh(struct device *dev,
> +				 struct device_attribute *attr,
> +				 const char *buf,
> +				 size_t size)
> +{
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +	int rc;
> +	int ret = size;
>  
>  	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
>  
> -	n += snprintf(data, max,
> -		      "#id\tmajor\tclient_name\tpool\tname\tsnap\tKB\n");
> +	rc = __rbd_update_snaps(rbd_dev);
> +	if (rc < 0)
> +		ret = rc;
>  
> -	list_for_each(tmp, &rbd_dev_list) {
> -		struct rbd_device *rbd_dev;
> +	mutex_unlock(&ctl_mutex);
> +	return ret;
> +}
>  
> -		rbd_dev = list_entry(tmp, struct rbd_device, node);
> -		n += snprintf(data+n, max-n,
> -			      "%d\t%d\tclient%lld\t%s\t%s\t%s\t%lld\n",
> -			      rbd_dev->id,
> -			      rbd_dev->major,
> -			      ceph_client_id(rbd_dev->client),
> -			      rbd_dev->pool_name,
> -			      rbd_dev->obj, rbd_dev->snap_name,
> -			      rbd_dev->header.image_size >> 10);
> -		if (n == max)
> +static DEVICE_ATTR(size, S_IRUGO, rbd_size_show, NULL);
> +static DEVICE_ATTR(major, S_IRUGO, rbd_major_show, NULL);
> +static DEVICE_ATTR(client_id, S_IRUGO, rbd_client_id_show, NULL);
> +static DEVICE_ATTR(pool, S_IRUGO, rbd_pool_show, NULL);
> +static DEVICE_ATTR(name, S_IRUGO, rbd_name_show, NULL);
> +static DEVICE_ATTR(refresh, S_IWUSR, NULL, rbd_image_refresh);
> +static DEVICE_ATTR(current_snap, S_IRUGO, rbd_snap_show, NULL);
> +static DEVICE_ATTR(create_snap, S_IWUSR, NULL, rbd_snap_add);
> +static DEVICE_ATTR(rollback_snap, S_IWUSR, NULL, rbd_snap_rollback);
> +
> +static struct attribute *rbd_attrs[] = {
> +	&dev_attr_size.attr,
> +	&dev_attr_major.attr,
> +	&dev_attr_client_id.attr,
> +	&dev_attr_pool.attr,
> +	&dev_attr_name.attr,
> +	&dev_attr_current_snap.attr,
> +	&dev_attr_refresh.attr,
> +	&dev_attr_create_snap.attr,
> +	&dev_attr_rollback_snap.attr,
> +	NULL
> +};
> +
> +static struct attribute_group rbd_attr_group = {
> +	.attrs = rbd_attrs,
> +};
> +
> +static const struct attribute_group *rbd_attr_groups[] = {
> +	&rbd_attr_group,
> +	NULL
> +};
> +
> +static void rbd_sysfs_dev_release(struct device *dev)
> +{
> +}
> +
> +static struct device_type rbd_device_type = {
> +	.name		= "rbd",
> +	.groups		= rbd_attr_groups,
> +	.release	= rbd_sysfs_dev_release,
> +};
> +
> +
> +/*
> +  sysfs - snapshots
> +*/
> +
> +static ssize_t rbd_snap_size_show(struct device *dev,
> +				  struct device_attribute *attr,
> +				  char *buf)
> +{
> +	struct rbd_snap *snap = container_of(dev, struct rbd_snap, dev);
> +
> +	return sprintf(buf, "%lld\n", (long long)snap->size);
> +}
> +
> +static ssize_t rbd_snap_id_show(struct device *dev,
> +				struct device_attribute *attr,
> +				char *buf)
> +{
> +	struct rbd_snap *snap = container_of(dev, struct rbd_snap, dev);
> +
> +	return sprintf(buf, "%lld\n", (long long)snap->id);
> +}
> +
> +static DEVICE_ATTR(snap_size, S_IRUGO, rbd_snap_size_show, NULL);
> +static DEVICE_ATTR(snap_id, S_IRUGO, rbd_snap_id_show, NULL);
> +
> +static struct attribute *rbd_snap_attrs[] = {
> +	&dev_attr_snap_size.attr,
> +	&dev_attr_snap_id.attr,
> +	NULL,
> +};
> +
> +static struct attribute_group rbd_snap_attr_group = {
> +	.attrs = rbd_snap_attrs,
> +};
> +
> +static void rbd_snap_dev_release(struct device *dev)
> +{
> +	struct rbd_snap *snap = container_of(dev, struct rbd_snap, dev);
> +	kfree(snap->name);
> +	kfree(snap);
> +}
> +
> +static const struct attribute_group *rbd_snap_attr_groups[] = {
> +	&rbd_snap_attr_group,
> +	NULL
> +};
> +
> +static struct device_type rbd_snap_device_type = {
> +	.groups		= rbd_snap_attr_groups,
> +	.release	= rbd_snap_dev_release,
> +};
> +
> +static void __rbd_remove_snap_dev(struct rbd_device *rbd_dev,
> +				  struct rbd_snap *snap)
> +{
> +	list_del(&snap->node);
> +	device_unregister(&snap->dev);
> +}
> +
> +static int rbd_register_snap_dev(struct rbd_device *rbd_dev,
> +				  struct rbd_snap *snap,
> +				  struct device *parent)
> +{
> +	struct device *dev = &snap->dev;
> +	int ret;
> +
> +	dev->type = &rbd_snap_device_type;
> +	dev->parent = parent;
> +	dev->release = rbd_snap_dev_release;
> +	dev_set_name(dev, "snap_%s", snap->name);
> +	ret = device_register(dev);
> +
> +	return ret;
> +}
> +
> +static int __rbd_add_snap_dev(struct rbd_device *rbd_dev,
> +			      int i, const char *name,
> +			      struct rbd_snap **snapp)
> +{
> +	int ret;
> +	struct rbd_snap *snap = kzalloc(sizeof(*snap), GFP_KERNEL);
> +	if (!snap)
> +		return -ENOMEM;
> +	snap->name = kstrdup(name, GFP_KERNEL);
> +	snap->size = rbd_dev->header.snap_sizes[i];
> +	snap->id = rbd_dev->header.snapc->snaps[i];
> +	if (device_is_registered(&rbd_dev->dev)) {
> +		ret = rbd_register_snap_dev(rbd_dev, snap,
> +					     &rbd_dev->dev);
> +		if (ret < 0)
> +			goto err;
> +	}
> +	*snapp = snap;
> +	return 0;
> +err:
> +	kfree(snap->name);
> +	kfree(snap);
> +	return ret;
> +}
> +
> +/*
> + * search for the previous snap in a null delimited string list
> + */
> +const char *rbd_prev_snap_name(const char *name, const char *start)
> +{
> +	if (name < start + 2)
> +		return NULL;
> +
> +	name -= 2;
> +	while (*name) {
> +		if (name == start)
> +			return start;
> +		name--;
> +	}
> +	return name + 1;
> +}
> +
> +/*
> + * compare the old list of snapshots that we have to what's in the header
> + * and update it accordingly. Note that the header holds the snapshots
> + * in reverse order (from newest to oldest) and we need to go from
> + * oldest to newest so that we don't get a duplicate snap name while
> + * doing so (e.g., a snapshot was removed and a new one was recreated
> + * with the same name).
> + */
> +static int __rbd_init_snaps_header(struct rbd_device *rbd_dev)
> +{
> +	const char *name, *first_name;
> +	int i = rbd_dev->header.total_snaps;
> +	struct rbd_snap *snap, *old_snap = NULL;
> +	int ret;
> +	struct list_head *p, *n;
> +
> +	first_name = rbd_dev->header.snap_names;
> +	name = first_name + rbd_dev->header.snap_names_len;
> +
> +	list_for_each_prev_safe(p, n, &rbd_dev->snaps) {
> +		u64 cur_id;
> +
> +		old_snap = list_entry(p, struct rbd_snap, node);
> +
> +		if (i)
> +			cur_id = rbd_dev->header.snapc->snaps[i - 1];
> +
> +		if (!i || old_snap->id < cur_id) {
> +			/* old_snap->id was skipped, thus was removed */
> +			__rbd_remove_snap_dev(rbd_dev, old_snap);
> +			continue;
> +		}
> +		if (old_snap->id == cur_id) {
> +			/* we have this snapshot already */
> +			i--;
> +			name = rbd_prev_snap_name(name, first_name);
> +			continue;
> +		}
> +		for (; i > 0;
> +		     i--, name = rbd_prev_snap_name(name, first_name)) {
> +			if (!name) {
> +				WARN_ON(1);
> +				return -EINVAL;
> +			}
> +			cur_id = rbd_dev->header.snapc->snaps[i];
> +			/* snapshot removal? handle it above */
> +			if (cur_id >= old_snap->id)
> +				break;
> +			/* a new snapshot */
> +			ret = __rbd_add_snap_dev(rbd_dev, i - 1, name, &snap);
> +			if (ret < 0)
> +				return ret;
> +
> +			/* note that we add it backward so using n and not p */
> +			list_add(&snap->node, n);
> +			p = &snap->node;
> +		}
> +	}
> +	/* we're done going over the old snap list, just add what's left */
> +	for (; i > 0; i--) {
> +		name = rbd_prev_snap_name(name, first_name);
> +		if (!name) {
> +			WARN_ON(1);
> +			return -EINVAL;
> +		}
> +		ret = __rbd_add_snap_dev(rbd_dev, i - 1, name, &snap);
> +		if (ret < 0)
> +			return ret;
> +		list_add(&snap->node, &rbd_dev->snaps);
> +	}
> +
> +	return 0;
> +}
> +
> +
> +static void rbd_root_dev_release(struct device *dev)
> +{
> +}
> +
> +static struct device rbd_root_dev = {
> +	.init_name =    "rbd",
> +	.release =      rbd_root_dev_release,
> +};
> +
> +static int rbd_bus_add_dev(struct rbd_device *rbd_dev)
> +{
> +	int ret = -ENOMEM;
> +	struct device *dev;
> +	struct rbd_snap *snap;
> +
> +	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
> +	dev = &rbd_dev->dev;
> +
> +	dev->bus = &rbd_bus_type;
> +	dev->type = &rbd_device_type;
> +	dev->parent = &rbd_root_dev;
> +	dev->release = rbd_dev_release;
> +	dev_set_name(dev, "%d", rbd_dev->id);
> +	ret = device_register(dev);
> +	if (ret < 0)
> +		goto done_free;
> +
> +	list_for_each_entry(snap, &rbd_dev->snaps, node) {
> +		ret = rbd_register_snap_dev(rbd_dev, snap,
> +					     &rbd_dev->dev);
> +		if (ret < 0)
>  			break;
>  	}
>  
>  	mutex_unlock(&ctl_mutex);
> -	return n;
> +	return 0;
> +done_free:
> +	mutex_unlock(&ctl_mutex);
> +	return ret;
>  }
>  
> -static ssize_t class_rbd_add(struct class *c,
> -			     struct class_attribute *attr,
> -			     const char *buf, size_t count)
> +static void rbd_bus_del_dev(struct rbd_device *rbd_dev)
> +{
> +	device_unregister(&rbd_dev->dev);
> +}
> +
> +static ssize_t rbd_add(struct bus_type *bus, const char *buf, size_t count)
>  {
>  	struct ceph_osd_client *osdc;
>  	struct rbd_device *rbd_dev;
> @@ -1419,6 +1728,7 @@ static ssize_t class_rbd_add(struct class *c,
>  	/* static rbd_device initialization */
>  	spin_lock_init(&rbd_dev->lock);
>  	INIT_LIST_HEAD(&rbd_dev->node);
> +	INIT_LIST_HEAD(&rbd_dev->snaps);
>  
>  	/* generate unique id: find highest unique id, add one */
>  	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
> @@ -1478,6 +1788,9 @@ static ssize_t class_rbd_add(struct class *c,
>  	}
>  	rbd_dev->major = irc;
>  
> +	rc = rbd_bus_add_dev(rbd_dev);
> +	if (rc)
> +		goto err_out_disk;
>  	/* set up and announce blkdev mapping */
>  	rc = rbd_init_disk(rbd_dev);
>  	if (rc)
> @@ -1487,6 +1800,8 @@ static ssize_t class_rbd_add(struct class *c,
>  
>  err_out_blkdev:
>  	unregister_blkdev(rbd_dev->major, rbd_dev->name);
> +err_out_disk:
> +	rbd_free_disk(rbd_dev);
>  err_out_client:
>  	rbd_put_client(rbd_dev);
>  	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
> @@ -1518,35 +1833,10 @@ static struct rbd_device *__rbd_get_dev(unsigned long id)
>  	return NULL;
>  }
>  
> -static ssize_t class_rbd_remove(struct class *c,
> -				struct class_attribute *attr,
> -				const char *buf,
> -				size_t count)
> +static void rbd_dev_release(struct device *dev)
>  {
> -	struct rbd_device *rbd_dev = NULL;
> -	int target_id, rc;
> -	unsigned long ul;
> -
> -	rc = strict_strtoul(buf, 10, &ul);
> -	if (rc)
> -		return rc;
> -
> -	/* convert to int; abort if we lost anything in the conversion */
> -	target_id = (int) ul;
> -	if (target_id != ul)
> -		return -EINVAL;
> -
> -	/* remove object from list immediately */
> -	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
> -
> -	rbd_dev = __rbd_get_dev(target_id);
> -	if (rbd_dev)
> -		list_del_init(&rbd_dev->node);
> -
> -	mutex_unlock(&ctl_mutex);
> -
> -	if (!rbd_dev)
> -		return -ENOENT;
> +	struct rbd_device *rbd_dev =
> +			container_of(dev, struct rbd_device, dev);
>  
>  	rbd_put_client(rbd_dev);
>  
> @@ -1557,67 +1847,11 @@ static ssize_t class_rbd_remove(struct class *c,
>  
>  	/* release module ref */
>  	module_put(THIS_MODULE);
> -
> -	return count;
>  }
>  
> -static ssize_t class_rbd_snaps_list(struct class *c,
> -			      struct class_attribute *attr,
> -			      char *data)
> -{
> -	struct rbd_device *rbd_dev = NULL;
> -	struct list_head *tmp;
> -	struct rbd_image_header *header;
> -	int i, n = 0, max = PAGE_SIZE;
> -	int ret;
> -
> -	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
> -
> -	n += snprintf(data, max, "#id\tsnap\tKB\n");
> -
> -	list_for_each(tmp, &rbd_dev_list) {
> -		char *names, *p;
> -		struct ceph_snap_context *snapc;
> -
> -		rbd_dev = list_entry(tmp, struct rbd_device, node);
> -		header = &rbd_dev->header;
> -
> -		down_read(&header->snap_rwsem);
> -
> -		names = header->snap_names;
> -		snapc = header->snapc;
> -
> -		n += snprintf(data + n, max - n, "%d\t%s\t%lld%s\n",
> -			      rbd_dev->id, RBD_SNAP_HEAD_NAME,
> -			      header->image_size >> 10,
> -			      (!rbd_dev->cur_snap ? " (*)" : ""));
> -		if (n == max)
> -			break;
> -
> -		p = names;
> -		for (i = 0; i < header->total_snaps; i++, p += strlen(p) + 1) {
> -			n += snprintf(data + n, max - n, "%d\t%s\t%lld%s\n",
> -			      rbd_dev->id, p, header->snap_sizes[i] >> 10,
> -			      (rbd_dev->cur_snap &&
> -			       (snap_index(header, i) == rbd_dev->cur_snap) ?
> -			       " (*)" : ""));
> -			if (n == max)
> -				break;
> -		}
> -
> -		up_read(&header->snap_rwsem);
> -	}
> -
> -
> -	ret = n;
> -	mutex_unlock(&ctl_mutex);
> -	return ret;
> -}
> -
> -static ssize_t class_rbd_snaps_refresh(struct class *c,
> -				struct class_attribute *attr,
> -				const char *buf,
> -				size_t count)
> +static ssize_t rbd_remove(struct bus_type *bus,
> +			  const char *buf,
> +			  size_t count)
>  {
>  	struct rbd_device *rbd_dev = NULL;
>  	int target_id, rc;
> @@ -1641,95 +1875,70 @@ static ssize_t class_rbd_snaps_refresh(struct class *c,
>  		goto done;
>  	}
>  
> -	rc = rbd_update_snaps(rbd_dev);
> -	if (rc < 0)
> -		ret = rc;
> +	list_del_init(&rbd_dev->node);
> +
> +	__rbd_remove_all_snaps(rbd_dev);
> +	rbd_bus_del_dev(rbd_dev);
>  
>  done:
>  	mutex_unlock(&ctl_mutex);
>  	return ret;
>  }
>  
> -static ssize_t class_rbd_snap_create(struct class *c,
> -				struct class_attribute *attr,
> -				const char *buf,
> -				size_t count)
> +static ssize_t rbd_snap_add(struct device *dev,
> +			    struct device_attribute *attr,
> +			    const char *buf,
> +			    size_t count)
>  {
> -	struct rbd_device *rbd_dev = NULL;
> -	int target_id, ret;
> -	char *name;
> -
> -	name = kmalloc(RBD_MAX_SNAP_NAME_LEN + 1, GFP_KERNEL);
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +	int ret;
> +	char *name = kmalloc(count + 1, GFP_KERNEL);
>  	if (!name)
>  		return -ENOMEM;
>  
> -	/* parse snaps add command */
> -	if (sscanf(buf, "%d "
> -		   "%" __stringify(RBD_MAX_SNAP_NAME_LEN) "s",
> -		   &target_id,
> -		   name) != 2) {
> -		ret = -EINVAL;
> -		goto done;
> -	}
> +	snprintf(name, count, "%s", buf);
>  
>  	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
>  
> -	rbd_dev = __rbd_get_dev(target_id);
> -	if (!rbd_dev) {
> -		ret = -ENOENT;
> -		goto done_unlock;
> -	}
> -
>  	ret = rbd_header_add_snap(rbd_dev,
>  				  name, GFP_KERNEL);
>  	if (ret < 0)
>  		goto done_unlock;
>  
> -	ret = rbd_update_snaps(rbd_dev);
> +	ret = __rbd_update_snaps(rbd_dev);
>  	if (ret < 0)
>  		goto done_unlock;
>  
>  	ret = count;
>  done_unlock:
>  	mutex_unlock(&ctl_mutex);
> -done:
>  	kfree(name);
>  	return ret;
>  }
>  
> -static ssize_t class_rbd_rollback(struct class *c,
> -				struct class_attribute *attr,
> -				const char *buf,
> -				size_t count)
> +static ssize_t rbd_snap_rollback(struct device *dev,
> +				 struct device_attribute *attr,
> +				 const char *buf,
> +				 size_t count)
>  {
> -	struct rbd_device *rbd_dev = NULL;
> -	int target_id, ret;
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +	int ret;
>  	u64 snapid;
> -	char snap_name[RBD_MAX_SNAP_NAME_LEN];
>  	u64 cur_ofs;
> -	char *seg_name;
> +	char *seg_name = NULL;
> +	char *snap_name = kmalloc(count + 1, GFP_KERNEL);
> +	ret = -ENOMEM;
> +	if (!snap_name)
> +		return ret;
>  
>  	/* parse snaps add command */
> -	if (sscanf(buf, "%d "
> -		   "%" __stringify(RBD_MAX_SNAP_NAME_LEN) "s",
> -		   &target_id,
> -		   snap_name) != 2) {
> -		return -EINVAL;
> -	}
> -
> -	ret = -ENOMEM;
> +	snprintf(snap_name, count, "%s", buf);
>  	seg_name = kmalloc(RBD_MAX_SEG_NAME_LEN + 1, GFP_NOIO);
>  	if (!seg_name)
> -		return ret;
> +		goto done;
>  
>  	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
>  
> -	rbd_dev = __rbd_get_dev(target_id);
> -	if (!rbd_dev) {
> -		ret = -ENOENT;
> -		goto done_unlock;
> -	}
> -
>  	ret = snap_by_name(&rbd_dev->header, snap_name, &snapid, NULL);
>  	if (ret < 0)
>  		goto done_unlock;
> @@ -1750,7 +1959,7 @@ static ssize_t class_rbd_rollback(struct class *c,
>  				   seg_name, ret);
>  	}
>  
> -	ret = rbd_update_snaps(rbd_dev);
> +	ret = __rbd_update_snaps(rbd_dev);
>  	if (ret < 0)
>  		goto done_unlock;
>  
> @@ -1758,57 +1967,42 @@ static ssize_t class_rbd_rollback(struct class *c,
>  
>  done_unlock:
>  	mutex_unlock(&ctl_mutex);
> +done:
>  	kfree(seg_name);
> +	kfree(snap_name);
>  
>  	return ret;
>  }
>  
> -static struct class_attribute class_rbd_attrs[] = {
> -	__ATTR(add,		0200, NULL, class_rbd_add),
> -	__ATTR(remove,		0200, NULL, class_rbd_remove),
> -	__ATTR(list,		0444, class_rbd_list, NULL),
> -	__ATTR(snaps_refresh,	0200, NULL, class_rbd_snaps_refresh),
> -	__ATTR(snap_create,	0200, NULL, class_rbd_snap_create),
> -	__ATTR(snaps_list,	0444, class_rbd_snaps_list, NULL),
> -	__ATTR(snap_rollback,	0200, NULL, class_rbd_rollback),
> +static struct bus_attribute rbd_bus_attrs[] = {
> +	__ATTR(add, S_IWUSR, NULL, rbd_add),
> +	__ATTR(remove, S_IWUSR, NULL, rbd_remove),
>  	__ATTR_NULL
>  };
>  
>  /*
>   * create control files in sysfs
> - * /sys/class/rbd/...
> + * /sys/bus/rbd/...
>   */
>  static int rbd_sysfs_init(void)
>  {
> -	int ret = -ENOMEM;
> +	int ret;
>  
> -	class_rbd = kzalloc(sizeof(*class_rbd), GFP_KERNEL);
> -	if (!class_rbd)
> -		goto out;
> +	rbd_bus_type.bus_attrs = rbd_bus_attrs;
>  
> -	class_rbd->name = DRV_NAME;
> -	class_rbd->owner = THIS_MODULE;
> -	class_rbd->class_release = class_rbd_release;
> -	class_rbd->class_attrs = class_rbd_attrs;
> +	ret = bus_register(&rbd_bus_type);
> +	if (ret < 0)
> +		return ret;
>  
> -	ret = class_register(class_rbd);
> -	if (ret)
> -		goto out_class;
> -	return 0;
> +	ret = device_register(&rbd_root_dev);
>  
> -out_class:
> -	kfree(class_rbd);
> -	class_rbd = NULL;
> -	pr_err(DRV_NAME ": failed to create class rbd\n");
> -out:
>  	return ret;
>  }
>  
>  static void rbd_sysfs_cleanup(void)
>  {
> -	if (class_rbd)
> -		class_destroy(class_rbd);
> -	class_rbd = NULL;
> +	device_unregister(&rbd_root_dev);
> +	bus_unregister(&rbd_bus_type);
>  }
>  
>  int __init rbd_init(void)
> -- 
> 1.5.6.5
> 
> 
> 
