Message-ID: <20100423001751.GX2524@linux.vnet.ibm.com>
Date: Thu, 22 Apr 2010 17:17:51 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Vivek Goyal <vgoyal@...hat.com>
Cc: linux kernel mailing list <linux-kernel@...r.kernel.org>,
Jens Axboe <jens.axboe@...cle.com>,
Li Zefan <lizf@...fujitsu.com>,
Gui Jianfeng <guijianfeng@...fujitsu.com>
Subject: Re: [PATCH] blk-cgroup: Fix RCU correctness warning in cfq_init_queue()

On Thu, Apr 22, 2010 at 07:55:55PM -0400, Vivek Goyal wrote:
> On Thu, Apr 22, 2010 at 04:15:56PM -0700, Paul E. McKenney wrote:
> > On Thu, Apr 22, 2010 at 11:54:52AM -0400, Vivek Goyal wrote:
> > > With RCU correctness on, we see the following warning. This patch fixes it.
> >
> > This is in initialization code, so that there cannot be any concurrent
> > updates, correct? If so, looks good.
> >
>
> I think theoretically two instances of cfq_init_queue() can be running
> in parallel (for two different devices), and they both can call
> blkiocg_add_blkio_group(). But then we use a spin lock to protect
> blkio_cgroup.
>
> spin_lock_irqsave(&blkcg->lock, flags);
>
> So I guess two parallel updates should be fine.

OK, in that case, would it be possible to add this spinlock to the condition
checked by css_id()'s rcu_dereference_check()? At first glance, css_id()
needs to gain access to the blkio_cgroup structure that references
the cgroup_subsys_state structure passed to css_id().

This means that there is only one blkio_cgroup structure referencing
a given cgroup_subsys_state structure, right? Otherwise, we could still
have concurrent access.

Thanx, Paul
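
The two options discussed above can be contrasted in a small userspace sketch. It is a toy model only, not kernel code: every toy_* name is hypothetical, the RCU read lock and blkcg->lock are modeled as a plain counter and a flag, and the check macro merely counts would-be lockdep reports. It assumes nothing about how css_id() is actually written beyond what the thread states, namely that it calls rcu_dereference_check() with some condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-ins only; none of this is the kernel's implementation. */
static int rcu_read_depth;     /* models rcu_read_lock() nesting */
static int suspicious_reports; /* counts would-be lockdep complaints */

struct toy_lock { bool held; };     /* models blkcg->lock */
struct toy_css  { const int *id; }; /* models the RCU-protected pointer */

static void toy_rcu_read_lock(void) { rcu_read_depth++; }
static void toy_rcu_read_unlock(void)
{
    assert(rcu_read_depth > 0); /* catch unbalanced unlocks */
    rcu_read_depth--;
}
static bool toy_rcu_read_lock_held(void) { return rcu_read_depth > 0; }

static void toy_lock_acquire(struct toy_lock *l) { l->held = true; }
static void toy_lock_release(struct toy_lock *l) { l->held = false; }

/* Report when the caller holds none of the protections named in cond. */
static void toy_check(bool cond, const char *cond_text)
{
    if (!cond) {
        suspicious_reports++;
        fprintf(stderr, "suspicious dereference: (%s) is false\n", cond_text);
    }
}

/* Models rcu_dereference_check(p, c): evaluate the check, then yield p. */
#define toy_rcu_dereference_check(p, c) (toy_check((c), #c), (p))

/* Variant A: only an RCU read-side critical section satisfies the check,
 * so the caller must wrap the call in rcu_read_lock(), as the patch does. */
static int toy_css_id_rcu_only(const struct toy_css *css)
{
    const int *id = toy_rcu_dereference_check(css->id,
                                              toy_rcu_read_lock_held());
    return id ? *id : 0;
}

/* Variant B: the update-side lock is also accepted by the condition,
 * which is the alternative raised in this reply. */
static int toy_css_id_rcu_or_lock(const struct toy_css *css,
                                  const struct toy_lock *lock)
{
    const int *id = toy_rcu_dereference_check(css->id,
                                              toy_rcu_read_lock_held() ||
                                              lock->held);
    return id ? *id : 0;
}

/* Walk through the scenario from the warning: lock held, no RCU read lock. */
static int toy_demo(void)
{
    int value = 7;
    struct toy_css css = { &value };
    struct toy_lock blkcg_lock = { false };

    suspicious_reports = 0;
    toy_lock_acquire(&blkcg_lock);
    (void)toy_css_id_rcu_only(&css);                 /* reported */
    (void)toy_css_id_rcu_or_lock(&css, &blkcg_lock); /* accepted */
    toy_rcu_read_lock();
    (void)toy_css_id_rcu_only(&css);                 /* accepted */
    toy_rcu_read_unlock();
    toy_lock_release(&blkcg_lock);
    return suspicious_reports;
}
```

Calling toy_demo() from a main() runs the three cases in order. Variant B keeps the check meaningful for callers that hold neither protection, but, as noted above, it requires the checking function to be able to reach the lock in question.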
> Thanks
> Vivek
>
> > (Just wanting to make sure that we are not papering over a real error!)
> >
> > Thanx, Paul
> >
> > > [ 103.790505] ===================================================
> > > [ 103.790509] [ INFO: suspicious rcu_dereference_check() usage. ]
> > > [ 103.790511] ---------------------------------------------------
> > > [ 103.790514] kernel/cgroup.c:4432 invoked rcu_dereference_check() without protection!
> > > [ 103.790517]
> > > [ 103.790517] other info that might help us debug this:
> > > [ 103.790519]
> > > [ 103.790521]
> > > [ 103.790521] rcu_scheduler_active = 1, debug_locks = 1
> > > [ 103.790524] 4 locks held by bash/4422:
> > > [ 103.790526] #0: (&buffer->mutex){+.+.+.}, at: [<ffffffff8114befa>] sysfs_write_file+0x3c/0x144
> > > [ 103.790537] #1: (s_active#102){.+.+.+}, at: [<ffffffff8114bfa5>] sysfs_write_file+0xe7/0x144
> > > [ 103.790544] #2: (&q->sysfs_lock){+.+.+.}, at: [<ffffffff812263b1>] queue_attr_store+0x49/0x8f
> > > [ 103.790552] #3: (&(&blkcg->lock)->rlock){......}, at: [<ffffffff8122e4db>] blkiocg_add_blkio_group+0x2b/0xad
> > > [ 103.790560]
> > > [ 103.790561] stack backtrace:
> > > [ 103.790564] Pid: 4422, comm: bash Not tainted 2.6.34-rc4-blkio-second-crash #81
> > > [ 103.790567] Call Trace:
> > > [ 103.790572] [<ffffffff81068f57>] lockdep_rcu_dereference+0x9d/0xa5
> > > [ 103.790577] [<ffffffff8107fac1>] css_id+0x44/0x57
> > > [ 103.790581] [<ffffffff8122e503>] blkiocg_add_blkio_group+0x53/0xad
> > > [ 103.790586] [<ffffffff81231936>] cfq_init_queue+0x139/0x32c
> > > [ 103.790591] [<ffffffff8121f2d0>] elv_iosched_store+0xbf/0x1bf
> > > [ 103.790595] [<ffffffff812263d8>] queue_attr_store+0x70/0x8f
> > > [ 103.790599] [<ffffffff8114bfa5>] ? sysfs_write_file+0xe7/0x144
> > > [ 103.790603] [<ffffffff8114bfc6>] sysfs_write_file+0x108/0x144
> > > [ 103.790609] [<ffffffff810f527f>] vfs_write+0xae/0x10b
> > > [ 103.790612] [<ffffffff81069863>] ? trace_hardirqs_on_caller+0x10c/0x130
> > > [ 103.790616] [<ffffffff810f539c>] sys_write+0x4a/0x6e
> > > [ 103.790622] [<ffffffff81002b5b>] system_call_fastpath+0x16/0x1b
> > > [ 103.790625]
> > >
> > > Signed-off-by: Vivek Goyal <vgoyal@...hat.com>
> > > ---
> > > block/cfq-iosched.c | 2 ++
> > > 1 files changed, 2 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> > > index 002a5b6..9386bf8 100644
> > > --- a/block/cfq-iosched.c
> > > +++ b/block/cfq-iosched.c
> > > @@ -3741,8 +3741,10 @@ static void *cfq_init_queue(struct request_queue *q)
> > > * to make sure that cfq_put_cfqg() does not try to kfree root group
> > > */
> > > atomic_set(&cfqg->ref, 1);
> > > + rcu_read_lock();
> > > blkiocg_add_blkio_group(&blkio_root_cgroup, &cfqg->blkg, (void *)cfqd,
> > > 0);
> > > + rcu_read_unlock();
> > > #endif
> > > /*
> > > * Not strictly needed (since RB_ROOT just clears the node and we
> > > --
> > > 1.6.2.5
> > >