Message-ID: <1284692130.26423.53.camel@mulgrave.site>
Date: Thu, 16 Sep 2010 22:55:30 -0400
From: James Bottomley <James.Bottomley@...e.de>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Naohiro Aota <naota@...sp.net>,
FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
linux-kernel@...r.kernel.org, Jens Axboe <axboe@...nel.dk>,
Jiri Kosina <jkosina@...e.cz>,
Greg Kroah-Hartman <gregkh@...e.de>,
Daniel Mack <daniel@...aq.de>, linux-scsi@...r.kernel.org
Subject: Re: [PATCH] bsg: call idr_pre_get() outside the lock.
On Thu, 2010-09-16 at 16:37 -0700, Andrew Morton wrote:
> On Wed, 01 Sep 2010 23:26:01 +0900
> Naohiro Aota <naota@...sp.net> wrote:
>
> > The idr_pre_get() kernel-doc says "This function should be called
> > prior to locking and calling the idr_get_new* functions.", but the
> > idr_pre_get() call in bsg_register_queue() is made inside
> > mutex_lock(). Let's fix it.
> >
>
> The idr_pre_get() kerneldoc is wrong. Or at least, misleading.
>
> The way this all works is that we precharge the tree via idr_pre_get()
> and we do this outside locks so we can use GFP_KERNEL. Then we take
> the lock (a spinlock!) and then try to add an element to the tree,
> which will consume objects from the pool which was prefilled by
> idr_pre_get().
>
> There's an obvious race here where someone else can get in and steal
> objects from the prefilled pool. We can fix that with a retry loop:
>
>
> again:
> 	if (idr_pre_get(..., GFP_KERNEL) == NULL)
> 		return -ENOMEM;		/* We're really out-of-memory */
> 	spin_lock(lock);
> 	if (idr_get_new(...) == -EAGAIN) {
> 		spin_unlock(lock);
> 		goto again;		/* Someone stole our preallocation! */
> 	}
> 	...
>
> this way we avoid the false -ENOMEM which the race would have caused.
> We only declare -ENOMEM when we're REALLY out of memory.
>
>
> But none of this is needed when a sleeping lock is used (as long as the
> sleeping lock isn't taken on the VM pageout path, of course). In
> that case we can use the sleeping lock to prevent the above race.
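
For illustration, a minimal sketch of the sleeping-lock pattern described
above, against the 2010-era idr API; the example_idr, example_mutex and
example_alloc_id names are hypothetical, not the actual bsg code:

#include <linux/idr.h>
#include <linux/mutex.h>
#include <linux/gfp.h>
#include <linux/errno.h>

static DEFINE_IDR(example_idr);
static DEFINE_MUTEX(example_mutex);

static int example_alloc_id(void *ptr, int *id)
{
	int ret;

	mutex_lock(&example_mutex);

	/* GFP_KERNEL is fine here: a mutex may sleep. */
	if (!idr_pre_get(&example_idr, GFP_KERNEL)) {
		ret = -ENOMEM;		/* really out of memory */
		goto unlock;
	}

	/*
	 * Assuming every other user of example_idr also allocates under
	 * example_mutex, nobody can steal the node preallocated above,
	 * so idr_get_new() never returns -EAGAIN here and no retry loop
	 * is needed.
	 */
	ret = idr_get_new(&example_idr, ptr, id);
unlock:
	mutex_unlock(&example_mutex);
	return ret;
}
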
>
> > diff --git a/block/bsg.c b/block/bsg.c
> > index 82d5882..5fd8dd1 100644
> > --- a/block/bsg.c
> > +++ b/block/bsg.c
> > @@ -1010,13 +1010,11 @@ int bsg_register_queue(struct request_queue *q, struct device *parent,
> > bcd = &q->bsg_dev;
> > memset(bcd, 0, sizeof(*bcd));
> >
> > - mutex_lock(&bsg_mutex);
> > -
> > ret = idr_pre_get(&bsg_minor_idr, GFP_KERNEL);
> > - if (!ret) {
> > - ret = -ENOMEM;
> > - goto unlock;
> > - }
> > + if (!ret)
> > + return -ENOMEM;
> > +
> > + mutex_lock(&bsg_mutex);
> >
> > ret = idr_get_new(&bsg_minor_idr, bcd, &minor);
> > if (ret < 0)
>
> So the old code was OK.
>
> The new code, however, is not OK because it is vulnerable to the above
> race wherein another CPU or thread comes in and steals all of this
> thread's preallocation.
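
For illustration, the same hypothetical helper reordered the way the patch
orders things, which is only safe when combined with the retry loop Andrew
shows above:

static int example_alloc_id_prealloc_outside(void *ptr, int *id)
{
	int ret;

again:
	/* Preallocate outside the lock, as the patch does for bsg. */
	if (!idr_pre_get(&example_idr, GFP_KERNEL))
		return -ENOMEM;		/* really out of memory */

	mutex_lock(&example_mutex);
	ret = idr_get_new(&example_idr, ptr, id);
	mutex_unlock(&example_mutex);

	/*
	 * Another thread may have consumed the preallocated node between
	 * idr_pre_get() and mutex_lock(); without this retry the caller
	 * would see a spurious failure even though memory is available.
	 */
	if (ret == -EAGAIN)
		goto again;

	return ret;
}
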
Hmm, you're right ... I've dropped the patch.
James
--