Message-ID: <1284692130.26423.53.camel@mulgrave.site>
Date: Thu, 16 Sep 2010 22:55:30 -0400
From: James Bottomley <James.Bottomley@...e.de>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Naohiro Aota <naota@...sp.net>,
FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
linux-kernel@...r.kernel.org, Jens Axboe <axboe@...nel.dk>,
Jiri Kosina <jkosina@...e.cz>,
Greg Kroah-Hartman <gregkh@...e.de>,
Daniel Mack <daniel@...aq.de>, linux-scsi@...r.kernel.org
Subject: Re: [PATCH] bsg: call idr_pre_get() outside the lock.
On Thu, 2010-09-16 at 16:37 -0700, Andrew Morton wrote:
> On Wed, 01 Sep 2010 23:26:01 +0900
> Naohiro Aota <naota@...sp.net> wrote:
>
> > The idr_pre_get() kernel-doc says "This function should be called
> > prior to locking and calling the idr_get_new* functions.", but the
> > idr_pre_get() call in bsg_register_queue() is made inside
> > mutex_lock(). Let's fix it.
> >
>
> The idr_pre_get() kerneldoc is wrong. Or at least, misleading.
>
> The way this all works is that we precharge the tree via idr_pre_get()
> and we do this outside locks so we can use GFP_KERNEL. Then we take
> the lock (a spinlock!) and then try to add an element to the tree,
> which will consume objects from the pool which was prefilled by
> idr_pre_get().
>
> There's an obvious race here where someone else can get in and steal
> objects from the prefilled pool. We can fix that with a retry loop:
>
>
> again:
> 	if (idr_pre_get(..., GFP_KERNEL) == NULL)
> 		return -ENOMEM;		/* We're really out-of-memory */
> 	spin_lock(lock);
> 	if (idr_get_new(...) == -EAGAIN) {
> 		spin_unlock(lock);
> 		goto again;		/* Someone stole our preallocation! */
> 	}
> 	...
>
> this way we avoid the false -ENOMEM which the race would have caused.
> We only declare -ENOMEM when we're REALLY out of memory.
>
>
> But none of this is needed when a sleeping lock is used (as long as the
> sleeping lock isn't taken on the VM pageout path, of course). In
> that case we can use the sleeping lock to prevent the above race.
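
For illustration, a minimal sketch of the sleeping-lock pattern described
above, against the 2010-era idr API; the example_idr, example_mutex and
example_alloc_id names are hypothetical, not the actual bsg code:

#include <linux/idr.h>
#include <linux/mutex.h>
#include <linux/gfp.h>
#include <linux/errno.h>

static DEFINE_IDR(example_idr);
static DEFINE_MUTEX(example_mutex);

static int example_alloc_id(void *ptr, int *id)
{
	int ret;

	mutex_lock(&example_mutex);

	/* GFP_KERNEL is fine here: a mutex may sleep. */
	if (!idr_pre_get(&example_idr, GFP_KERNEL)) {
		ret = -ENOMEM;		/* really out of memory */
		goto unlock;
	}

	/*
	 * Assuming every other user of example_idr also allocates under
	 * example_mutex, nobody can steal the node preallocated above,
	 * so idr_get_new() never returns -EAGAIN here and no retry loop
	 * is needed.
	 */
	ret = idr_get_new(&example_idr, ptr, id);
unlock:
	mutex_unlock(&example_mutex);
	return ret;
}
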
>
> > diff --git a/block/bsg.c b/block/bsg.c
> > index 82d5882..5fd8dd1 100644
> > --- a/block/bsg.c
> > +++ b/block/bsg.c
> > @@ -1010,13 +1010,11 @@ int bsg_register_queue(struct request_queue *q, struct device *parent,
> > bcd = &q->bsg_dev;
> > memset(bcd, 0, sizeof(*bcd));
> >
> > - mutex_lock(&bsg_mutex);
> > -
> > ret = idr_pre_get(&bsg_minor_idr, GFP_KERNEL);
> > - if (!ret) {
> > - ret = -ENOMEM;
> > - goto unlock;
> > - }
> > + if (!ret)
> > + return -ENOMEM;
> > +
> > + mutex_lock(&bsg_mutex);
> >
> > ret = idr_get_new(&bsg_minor_idr, bcd, &minor);
> > if (ret < 0)
>
> So the old code was OK.
>
> The new code, however, is not OK because it is vulnerable to the above
> race wherein another CPU or thread comes in and steals all of this
> thread's preallocation.
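
For illustration, the same hypothetical helper reordered the way the patch
orders things, which is only safe when combined with the retry loop Andrew
shows above:

static int example_alloc_id_prealloc_outside(void *ptr, int *id)
{
	int ret;

again:
	/* Preallocate outside the lock, as the patch does for bsg. */
	if (!idr_pre_get(&example_idr, GFP_KERNEL))
		return -ENOMEM;		/* really out of memory */

	mutex_lock(&example_mutex);
	ret = idr_get_new(&example_idr, ptr, id);
	mutex_unlock(&example_mutex);

	/*
	 * Another thread may have consumed the preallocated node between
	 * idr_pre_get() and mutex_lock(); without this retry the caller
	 * would see a spurious failure even though memory is available.
	 */
	if (ret == -EAGAIN)
		goto again;

	return ret;
}
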
Hmm, you're right ... I've dropped the patch.
James
--