lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200519155210.GU11244@42.do-not-panic.com>
Date:   Tue, 19 May 2020 15:52:10 +0000
From:   Luis Chamberlain <mcgrof@...nel.org>
To:     Greg KH <gregkh@...uxfoundation.org>
Cc:     axboe@...nel.dk, viro@...iv.linux.org.uk, bvanassche@....org,
        rostedt@...dmis.org, mingo@...hat.com, jack@...e.cz,
        ming.lei@...hat.com, nstange@...e.de, akpm@...ux-foundation.org,
        mhocko@...e.com, yukuai3@...wei.com, linux-block@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Omar Sandoval <osandov@...com>,
        Hannes Reinecke <hare@...e.com>,
        Michal Hocko <mhocko@...nel.org>,
        syzbot+603294af2d01acfdd6da@...kaller.appspotmail.com
Subject: Re: [PATCH v5 5/7] blktrace: fix debugfs use after free

On Tue, May 19, 2020 at 04:44:08PM +0200, Greg KH wrote:
> On Sat, May 16, 2020 at 03:19:54AM +0000, Luis Chamberlain wrote:
> >  struct dentry *blk_debugfs_root;
> > +struct dentry *blk_debugfs_bsg = NULL;
> 
> checkpatch didn't complain about "= NULL;"?

Will remove.

> > +static void queue_debugfs_register_type(struct request_queue *q,
> > +					const char *name,
> > +					enum blk_debugfs_dir_type type)
> > +{
> > +	struct dentry *base_dir = queue_get_base_dir(type);
> 
> And it could be a simple if statement instead.
> 
> Oh well, I don't have to maintain this :)

I'll just use that, but yeah I think its a matter of preference.

> > +/**
> > + * blk_queue_debugfs_register - register the debugfs_dir for the block device
> > + * @q: the associated request_queue of the block device
> > + * @name: the name of the block device exposed
> > + *
> > + * This is used to create the debugfs_dir used by the block layer and blktrace.
> > + * Drivers which use any of the *add_disk*() calls or variants have this called
> > + * automatically for them. This directory is removed automatically on
> > + * blk_release_queue() once the request_queue reference count reaches 0.
> > + */
> > +void blk_queue_debugfs_register(struct request_queue *q, const char *name)
> > +{
> > +	queue_debugfs_register_type(q, name, BLK_DBG_DIR_BASE);
> > +}
> > +EXPORT_SYMBOL_GPL(blk_queue_debugfs_register);
> > +
> > +/**
> > + * blk_queue_debugfs_unregister - remove the debugfs_dir for the block device
> > + * @q: the associated request_queue of the block device
> > + *
> > + * Removes the debugfs_dir for the request_queue on the associated block device.
> > + * This is handled for you on blk_release_queue(), and that should only be
> > + * called once.
> > + *
> > + * Since we don't care where the debugfs_dir was created this is used for all
> > + * types of of enum blk_debugfs_dir_type.
> > + */
> > +void blk_queue_debugfs_unregister(struct request_queue *q)
> > +{
> > +	debugfs_remove_recursive(q->debugfs_dir);
> > +}
> 
> Why is register needed to be exported, but unregister does not?  Does
> some driver not properly clean things up?

Is the comment on blk_queue_debugfs_register() not sufficient?
I thought I was going overboard with how clear this was.  Should I also
add a note here on unregister?

> > +
> > +static struct dentry *queue_debugfs_symlink_type(struct request_queue *q,
> > +						 const char *src,
> > +						 const char *dst,
> > +						 enum blk_debugfs_dir_type type)
> > +{
> > +	struct dentry *dentry = ERR_PTR(-EINVAL);
> > +	char *dir_dst = NULL;
> > +
> > +	switch (type) {
> > +	case BLK_DBG_DIR_BASE:
> > +		if (dst)
> > +			dir_dst = kasprintf(GFP_KERNEL, "%s", dst);
> > +		else if (!IS_ERR_OR_NULL(q->debugfs_dir))
> > +			dir_dst = kasprintf(GFP_KERNEL, "%s",
> > +					    q->debugfs_dir->d_name.name);
> 
> There really is no other way to get the name of the directory other than
> from the dentry?  It's not in the queue itself somewhere?

Nope, beyond that, the problem is that the caller can be scsi-generic
and the queue name instantiation is opaque to what happens below, and
the name of that target directory is only set when the async probe
completes, much after the class_interface sg_add_device(). That is, the
request_queue is shared between scsi-generic device and another driver
which depends on the scsi driver type: TYPE_DISK, TYPE_TAPE, etc. The
sg_add_device() gets called before the debugfs_dir name is even determined
and set. This is why I punted setting the symlink to the ioctl on
scsi-generic.

If we add a post-probe class_interface callback, and make scsi-generic
use it, it would only allow us to set the symlink at a better time
during initialization after the async probe instead of the ioctl, then
if we give the class_interface the now probed device we *could* instead
use device_name().

I thought this would be a welcomed change, but I see this as an
evolution.  In particular older kernels will have to use this format,
unless they want to carry extensions to the class_interface as well.

> Anyway, not a big deal, just trying to not expose debugfs internals
> here.

I'm with you on this, I'd personally prefer to see an extension to the
class_interface as an evolution, that way these fixes can be backported
without much hassle, and the *need* for the new class_interface call is
clearer.

But I'll yield to what folks prefer here.

> > diff --git a/block/partitions/core.c b/block/partitions/core.c
> > index 297004fd2264..4d2a130e6055 100644
> > --- a/block/partitions/core.c
> > +++ b/block/partitions/core.c
> > @@ -10,6 +10,7 @@
> >  #include <linux/vmalloc.h>
> >  #include <linux/blktrace_api.h>
> >  #include <linux/raid/detect.h>
> > +#include <linux/debugfs.h>
> >  #include "check.h"
> >  
> >  static int (*check_part[])(struct parsed_partitions *) = {
> > @@ -320,6 +321,9 @@ void delete_partition(struct gendisk *disk, struct hd_struct *part)
> >  	 *  we have to hold the disk device
> >  	 */
> >  	get_device(disk_to_dev(part_to_disk(part)));
> > +#ifdef CONFIG_DEBUG_FS
> > +	debugfs_remove(part->debugfs_sym);
> > +#endif
> 
> Why is the #ifdef needed?  It shouldn't be.

Because debugfs_sym is a member which is only extended if
CONFIG_DEBUG_FS is defined.

> And why not recursive?

recursive seems odd for a symlink.

> >  	rcu_assign_pointer(ptbl->part[part->partno], NULL);
> >  	kobject_put(part->holder_dir);
> >  	device_del(part_to_dev(part));
> > @@ -460,6 +464,11 @@ static struct hd_struct *add_partition(struct gendisk *disk, int partno,
> >  	/* everything is up and running, commence */
> >  	rcu_assign_pointer(ptbl->part[partno], p);
> >  
> > +#ifdef CONFIG_DEBUG_FS
> > +	p->debugfs_sym = blk_queue_debugfs_symlink(disk->queue, dev_name(pdev),
> > +						   disk->disk_name);
> > +#endif
> 
> Again, no #ifdef should be needed here, just provide the "empty"
> function in the .h file.
> 
> You know this stuff :)

Well it was only *one* function, if we want the boiler plate stuff to
not deal with it, fine, I'll wrap it around and provide a helper for
these. It just seemed overkill.

> > @@ -1644,6 +1716,9 @@ sg_remove_device(struct device *cl_dev, struct class_interface *cl_intf)
> >  
> >  	sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
> >  	device_destroy(sg_sysfs_class, MKDEV(SCSI_GENERIC_MAJOR, sdp->index));
> > +#ifdef CONFIG_DEBUG_FS
> > +	debugfs_remove(sdp->debugfs_sym);
> > +#endif
> 
> Again, no need for the #ifdef.
> 
> If you are worried about the variable not being there, just always put
> it in the structure, it's only a pointer, for something that there are
> not a lot of, right?

Alright, will use wrappers.

  Luis

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ