linux-kernel - Re: [PATCH v2 03/10] blktrace: fix debugfs use after free

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200422074802.GS11244@42.do-not-panic.com>
Date:   Wed, 22 Apr 2020 07:48:02 +0000
From:   Luis Chamberlain <mcgrof@...nel.org>
To:     Christoph Hellwig <hch@...radead.org>
Cc:     axboe@...nel.dk, viro@...iv.linux.org.uk, bvanassche@....org,
        gregkh@...uxfoundation.org, rostedt@...dmis.org, mingo@...hat.com,
        jack@...e.cz, ming.lei@...hat.com, nstange@...e.de,
        akpm@...ux-foundation.org, mhocko@...e.com, yukuai3@...wei.com,
        linux-block@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Omar Sandoval <osandov@...com>,
        Hannes Reinecke <hare@...e.com>,
        Michal Hocko <mhocko@...nel.org>,
        syzbot+603294af2d01acfdd6da@...kaller.appspotmail.com
Subject: Re: [PATCH v2 03/10] blktrace: fix debugfs use after free

On Wed, Apr 22, 2020 at 12:27:15AM -0700, Christoph Hellwig wrote:
> On Sun, Apr 19, 2020 at 07:45:22PM +0000, Luis Chamberlain wrote:
> > +{
> > +	struct dentry *dir = NULL;
> > +
> > +	/* This can happen if we have a bug in the lower layers */
> > +	dir = debugfs_lookup(kobject_name(q->kobj.parent), blk_debugfs_root);
> > +	if (dir) {
> > +		pr_warn("%s: registering request_queue debugfs directory twice is not allowed\n",
> > +			kobject_name(q->kobj.parent));
> > +		dput(dir);
> > +		return -EALREADY;
> > +	}
> 
> I don't see why we need this check.  If it is valueable enough we
> should have a debugfs_create_dir_exclusive or so that retunrns an error
> for an exsting directory, instead of reimplementing it in the caller in
> a racy way.  But I'm not really sure we need it to start with.

In short races, and even with synchronous request_queue removal I'm
seeing the race is still possible, but that's due to some other races
I'm going to chase down now.

The easier solution really is to just have a debugfs dir created for
each partition if debugfs is enabled, this way the directory will
always be there, and the lookups are gone.

> > +
> > +	q->debugfs_dir = debugfs_create_dir(kobject_name(q->kobj.parent),
> > +					    blk_debugfs_root);
> > +	if (!q->debugfs_dir)
> > +		return -ENOMEM;
> > +
> > +	return 0;
> > +}
> > +
> > +void blk_queue_debugfs_unregister(struct request_queue *q)
> > +{
> > +	debugfs_remove_recursive(q->debugfs_dir);
> > +	q->debugfs_dir = NULL;
> > +}
> 
> Which to me suggests we can just fold these two into the callers,
> with an IS_ENABLED for the creation case given that we check for errors
> and the stub will always return an error.

Sorry not sure I follow this.

> >  	debugfs_create_files(q->debugfs_dir, q, blk_mq_debugfs_queue_attrs);
> >  
> >  	/*
> > @@ -856,9 +853,7 @@ void blk_mq_debugfs_register(struct request_queue *q)
> >  
> >  void blk_mq_debugfs_unregister(struct request_queue *q)
> >  {
> > -	debugfs_remove_recursive(q->debugfs_dir);
> >  	q->sched_debugfs_dir = NULL;
> > -	q->debugfs_dir = NULL;
> >  }
> 
> This function is weird - the sched dir gets removed by the
> debugfs_remove_recursive, so just leaving a function that clears
> a pointer is rather odd.  In fact I don't think we need to clear
> either sched_debugfs_dir or debugfs_dir anywhere.

Indeed. Will clean it up.

> > @@ -975,6 +976,14 @@ int blk_register_queue(struct gendisk *disk)
> >  		goto unlock;
> >  	}
> >  
> > +	ret = blk_queue_debugfs_register(q);
> > +	if (ret) {
> > +		blk_trace_remove_sysfs(dev);
> > +		kobject_del(&q->kobj);
> > +		kobject_put(&dev->kobj);
> > +		goto unlock;
> > +	}
> > +
> 
> Please use a goto label to consolidate the common cleanup code.

Sure.

> Also I think these generic debugfs changes probably should be separate
> to the blktrace changes.

I'll try to do that.

> >  static struct dentry *blk_trace_debugfs_dir(struct blk_user_trace_setup *buts,
> > +					    struct request_queue *q,
> >  					    struct blk_trace *bt)
> >  {
> >  	struct dentry *dir = NULL;
> >  
> > +	/* This can only happen if we have a bug on our lower layers */
> > +	if (!q->kobj.parent) {
> > +		pr_warn("%s: request_queue parent is gone\n", buts->name);
> > +		return NULL;
> > +	}
> 
> Why is this not simply a WARN_ON_ONCE()?

I'll actually remove it and instead fix the race where it happens.

> > +	if (blk_trace_target_disk(buts->name, kobject_name(q->kobj.parent))) {
> > +		if (!q->debugfs_dir) {
> > +			pr_warn("%s: expected request_queue debugfs_dir is not set\n",
> > +				buts->name);
> > +			return NULL;
> > +		}
> > +		/*
> > +		 * debugfs_lookup() is used to ensure the directory is not
> > +		 * taken from underneath us. We must dput() it later once
> > +		 * done with it within blktrace.
> > +		 */
> > +		dir = debugfs_lookup(buts->name, blk_debugfs_root);
> > +		if (!dir) {
> > +			pr_warn("%s: expected request_queue debugfs_dir dentry is gone\n",
> > +				buts->name);
> > +			return NULL;
> > +		}
> > +		 /*
> > +		 * This is a reaffirmation that debugfs_lookup() shall always
> > +		 * return the same dentry if it was already set.
> > +		 */
> > +		if (dir != q->debugfs_dir) {
> > +			dput(dir);
> > +			pr_warn("%s: expected dentry dir != q->debugfs_dir\n",
> > +				buts->name);
> > +			return NULL;
> > +		}
> > +		bt->backing_dir = q->debugfs_dir;
> > +		return bt->backing_dir;
> > +	}
> 
> Even with the gigantic commit log I don't get the point of this
> code.  It looks rather sketchy and I can't find a rationale for it.

Yeah I think this is going to be much easier on the eyes with the
revert to synchronous request_queue removal first.

  Luis