Message-ID: <20100423144138.GA5026@redhat.com>
Date:	Fri, 23 Apr 2010 10:41:38 -0400
From:	Vivek Goyal <vgoyal@...hat.com>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:	linux kernel mailing list <linux-kernel@...r.kernel.org>,
	Jens Axboe <jens.axboe@...cle.com>,
	Li Zefan <lizf@...fujitsu.com>,
	Gui Jianfeng <guijianfeng@...fujitsu.com>
Subject: Re: [PATCH] blk-cgroup: Fix RCU correctness warning in
	cfq_init_queue()

On Thu, Apr 22, 2010 at 05:17:51PM -0700, Paul E. McKenney wrote:
> On Thu, Apr 22, 2010 at 07:55:55PM -0400, Vivek Goyal wrote:
> > On Thu, Apr 22, 2010 at 04:15:56PM -0700, Paul E. McKenney wrote:
> > > On Thu, Apr 22, 2010 at 11:54:52AM -0400, Vivek Goyal wrote:
> > > > With RCU correctness checking on, we see the following warning. This patch fixes it.
> > > 
> > > This is in initialization code, so that there cannot be any concurrent
> > > updates, correct?  If so, looks good.
> > > 
> > 
> > I think theoretically two instances of cfq_init_queue() can be running
> > in parallel (for two different devices), and both can call
> > blkiocg_add_blkio_group(). But there we use a spinlock to protect the
> > blkio_cgroup.
> > 
> > spin_lock_irqsave(&blkcg->lock, flags);
> > 
> > So I guess two parallel updates should be fine.
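
For reference, the update path under discussion looks roughly like
this (a simplified sketch of the 2.6.34-era blkiocg_add_blkio_group(),
not a verbatim copy):

	void blkiocg_add_blkio_group(struct blkio_cgroup *blkcg,
				     struct blkio_group *blkg, void *key,
				     dev_t dev)
	{
		unsigned long flags;

		/* blkcg->lock serializes parallel writers, e.g. two
		 * cfq_init_queue() instances for different devices. */
		spin_lock_irqsave(&blkcg->lock, flags);
		rcu_assign_pointer(blkg->key, key);
		/* css_id() uses rcu_dereference_check(), hence the
		 * lockdep warning when called without rcu_read_lock(). */
		blkg->blkcg_id = css_id(&blkcg->css);
		hlist_add_head_rcu(&blkg->blkg_node, &blkcg->blkg_list);
		spin_unlock_irqrestore(&blkcg->lock, flags);
	}
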
> 
> OK, in that case, would it be possible to add this spinlock to the
> condition checked by css_id()'s rcu_dereference_check()?

Hi Paul,

I think adding these spinlocks to the checked condition might get a
little messy, the reason being that this lock is subsystem (controller)
specific and maintained by the controller. If every controller that
implements such a lock had it added to css_id()'s
rcu_dereference_check(), it would look ugly.

So probably a better way is to make sure that css_id() is always called
under the RCU read lock, so that we don't hit this warning?
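
To illustrate, css_id() is implemented roughly along these lines (a
simplified sketch, not the exact kernel/cgroup.c code):

	unsigned short css_id(struct cgroup_subsys_state *css)
	{
		struct css_id *cssid;

		/* lockdep warns unless the check condition holds */
		cssid = rcu_dereference_check(css->id,
					      rcu_read_lock_held());
		if (cssid)
			return cssid->id;
		return 0;
	}

Extending that condition as suggested would mean something like
"rcu_read_lock_held() || lockdep_is_held(&blkcg->lock)", but
blkcg->lock is private to the block controller and is not visible from
generic cgroup code, so every controller would end up leaking its own
lock into css_id(). Wrapping the caller in rcu_read_lock() /
rcu_read_unlock(), as the patch below does, keeps css_id() generic.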

>  At first glance, css_id()
> needs to gain access to the blkio_cgroup structure that references
> the cgroup_subsys_state structure passed to css_id().
> 
> This means that there is only one blkio_cgroup structure referencing
> a given cgroup_subsys_state structure, right?  Otherwise, we could still
> have concurrent access.

Yes. In fact the css object is embedded in the blkio_cgroup structure.
So we take rcu_read_lock() so that the data structures associated with
the cgroup subsystem don't go away, and then take the controller-specific
blkio_cgroup spinlock to make sure multiple writers don't end up
modifying the list at the same time.
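
Concretely, the layout is roughly as follows (a simplified sketch of
the 2.6.34-era structure; other members omitted):

	struct blkio_cgroup {
		struct cgroup_subsys_state css;	/* embedded generic state */
		spinlock_t lock;		/* serializes writers */
		struct hlist_head blkg_list;	/* RCU-protected group list */
	};

Since the css is embedded, there is exactly one blkio_cgroup per
cgroup_subsys_state, so rcu_read_lock() on the read side plus
blkcg->lock on the write side covers all the accesses.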

Am I missing something?

Thanks
Vivek
 

> > > (Just wanting to make sure that we are not papering over a real error!)
> > > 
> > > 							Thanx, Paul
> > > 
> > > > [  103.790505] ===================================================
> > > > [  103.790509] [ INFO: suspicious rcu_dereference_check() usage. ]
> > > > [  103.790511] ---------------------------------------------------
> > > > [  103.790514] kernel/cgroup.c:4432 invoked rcu_dereference_check() without protection!
> > > > [  103.790517]
> > > > [  103.790517] other info that might help us debug this:
> > > > [  103.790519]
> > > > [  103.790521]
> > > > [  103.790521] rcu_scheduler_active = 1, debug_locks = 1
> > > > [  103.790524] 4 locks held by bash/4422:
> > > > [  103.790526]  #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff8114befa>] sysfs_write_file+0x3c/0x144
> > > > [  103.790537]  #1:  (s_active#102){.+.+.+}, at: [<ffffffff8114bfa5>] sysfs_write_file+0xe7/0x144
> > > > [  103.790544]  #2:  (&q->sysfs_lock){+.+.+.}, at: [<ffffffff812263b1>] queue_attr_store+0x49/0x8f
> > > > [  103.790552]  #3:  (&(&blkcg->lock)->rlock){......}, at: [<ffffffff8122e4db>] blkiocg_add_blkio_group+0x2b/0xad
> > > > [  103.790560]
> > > > [  103.790561] stack backtrace:
> > > > [  103.790564] Pid: 4422, comm: bash Not tainted 2.6.34-rc4-blkio-second-crash #81
> > > > [  103.790567] Call Trace:
> > > > [  103.790572]  [<ffffffff81068f57>] lockdep_rcu_dereference+0x9d/0xa5
> > > > [  103.790577]  [<ffffffff8107fac1>] css_id+0x44/0x57
> > > > [  103.790581]  [<ffffffff8122e503>] blkiocg_add_blkio_group+0x53/0xad
> > > > [  103.790586]  [<ffffffff81231936>] cfq_init_queue+0x139/0x32c
> > > > [  103.790591]  [<ffffffff8121f2d0>] elv_iosched_store+0xbf/0x1bf
> > > > [  103.790595]  [<ffffffff812263d8>] queue_attr_store+0x70/0x8f
> > > > [  103.790599]  [<ffffffff8114bfa5>] ? sysfs_write_file+0xe7/0x144
> > > > [  103.790603]  [<ffffffff8114bfc6>] sysfs_write_file+0x108/0x144
> > > > [  103.790609]  [<ffffffff810f527f>] vfs_write+0xae/0x10b
> > > > [  103.790612]  [<ffffffff81069863>] ? trace_hardirqs_on_caller+0x10c/0x130
> > > > [  103.790616]  [<ffffffff810f539c>] sys_write+0x4a/0x6e
> > > > [  103.790622]  [<ffffffff81002b5b>] system_call_fastpath+0x16/0x1b
> > > > [  103.790625]
> > > > 
> > > > Signed-off-by: Vivek Goyal <vgoyal@...hat.com>
> > > > ---
> > > >  block/cfq-iosched.c |    2 ++
> > > >  1 files changed, 2 insertions(+), 0 deletions(-)
> > > > 
> > > > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> > > > index 002a5b6..9386bf8 100644
> > > > --- a/block/cfq-iosched.c
> > > > +++ b/block/cfq-iosched.c
> > > > @@ -3741,8 +3741,10 @@ static void *cfq_init_queue(struct request_queue *q)
> > > >  	 * to make sure that cfq_put_cfqg() does not try to kfree root group
> > > >  	 */
> > > >  	atomic_set(&cfqg->ref, 1);
> > > > +	rcu_read_lock();
> > > >  	blkiocg_add_blkio_group(&blkio_root_cgroup, &cfqg->blkg, (void *)cfqd,
> > > >  					0);
> > > > +	rcu_read_unlock();
> > > >  #endif
> > > >  	/*
> > > >  	 * Not strictly needed (since RB_ROOT just clears the node and we
> > > > -- 
> > > > 1.6.2.5
> > > > 
