Message-ID: <20100423194649.GF2589@linux.vnet.ibm.com>
Date:	Fri, 23 Apr 2010 12:46:49 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Vivek Goyal <vgoyal@...hat.com>
Cc:	linux kernel mailing list <linux-kernel@...r.kernel.org>,
	Jens Axboe <jens.axboe@...cle.com>,
	Li Zefan <lizf@...fujitsu.com>,
	Gui Jianfeng <guijianfeng@...fujitsu.com>
Subject: Re: [PATCH] blk-cgroup: Fix RCU correctness warning in
 cfq_init_queue()

On Fri, Apr 23, 2010 at 10:41:38AM -0400, Vivek Goyal wrote:
> On Thu, Apr 22, 2010 at 05:17:51PM -0700, Paul E. McKenney wrote:
> > On Thu, Apr 22, 2010 at 07:55:55PM -0400, Vivek Goyal wrote:
> > > On Thu, Apr 22, 2010 at 04:15:56PM -0700, Paul E. McKenney wrote:
> > > > On Thu, Apr 22, 2010 at 11:54:52AM -0400, Vivek Goyal wrote:
> > > > > With RCU correctness checking enabled, we see the following warning. This patch fixes it.
> > > > 
> > > > This is in initialization code, so there cannot be any concurrent
> > > > updates, correct?  If so, looks good.
> > > > 
> > > 
> > > I think theoretically two instances of cfq_init_queue() can be running
> > > in parallel (for two different devices), and they both can call
> > > blkiocg_add_blkio_group(). But then we use a spinlock to protect the
> > > blkio_cgroup:
> > > 
> > > spin_lock_irqsave(&blkcg->lock, flags);
> > > 
> > > So I guess two parallel updates should be fine.
> > 
> > OK, in that case, would it be possible to add this spinlock to the
> > condition checked by css_id()'s rcu_dereference_check()?
> 
> Hi Paul,
> 
> I think adding this spinlock to the checked condition might become a
> little messy, the reason being that this lock is subsystem (controller)
> specific and maintained by the controller. If every controller that
> implements such a lock had to add it to css_id()'s
> rcu_dereference_check(), it would get ugly.
> 
> So probably a better way is to make sure that css_id() is always called
> under the RCU read lock, so that we don't hit this warning?

As long as holding rcu_read_lock() prevents css_id() from the usual
problems, such as accessing memory that was concurrently freed, yes.
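
[Editor's note: for illustration only, the alternative floated above --
teaching css_id()'s rcu_dereference_check() about the controller lock --
would look roughly like the sketch below. The `blkcg` pointer here is a
hypothetical stand-in: generic cgroup code has no such pointer, which is
exactly the layering objection raised in this thread.]

```c
/*
 * Sketch only -- not the upstream implementation.  The second argument
 * of rcu_dereference_check() lists the conditions under which the
 * access is known to be safe; naming a controller-private lock there
 * would require generic cgroup code to know about blkcg->lock:
 */
cssid = rcu_dereference_check(css->id,
			      rcu_read_lock_held() ||
			      lockdep_is_held(&blkcg->lock));
```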

> >  At first glance, css_id()
> > needs to gain access to the blkio_cgroup structure that references
> > the cgroup_subsys_state structure passed to css_id().
> > 
> > This means that there is only one blkio_cgroup structure referencing
> > a given cgroup_subsys_state structure, right?  Otherwise, we could still
> > have concurrent access.
> 
> Yes. In fact the css object is embedded in the blkio_cgroup structure.
> So we take rcu_read_lock() so that the data structures associated with
> the cgroup subsystem don't go away, and then take the controller-specific
> blkio_cgroup spinlock to make sure multiple writers don't end up
> modifying a list at the same time.
> 
> Am I missing something?

This sounds very good!

I did have to ask!  ;-)

							Thanx, Paul
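
[Editor's note: the two-level locking pattern agreed on above can be
sketched as follows. This is a simplified stand-in for the body of
blkiocg_add_blkio_group(), not the exact upstream code; field names
such as blkg_list follow the 2.6.34-era structures but should be
treated as illustrative.]

```c
/*
 * Simplified sketch of the pattern discussed above.  rcu_read_lock()
 * keeps the cgroup data structures from being freed out from under us
 * (satisfying css_id()'s rcu_dereference_check()); the controller-
 * private blkcg->lock serializes concurrent writers to the list.
 */
rcu_read_lock();
spin_lock_irqsave(&blkcg->lock, flags);
blkg->blkcg_id = css_id(&blkcg->css);	/* safe: inside RCU read side */
hlist_add_head_rcu(&blkg->blkcg_node, &blkcg->blkg_list);
spin_unlock_irqrestore(&blkcg->lock, flags);
rcu_read_unlock();
```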

> Thanks
> Vivek
> 
> 
> > > > (Just wanting to make sure that we are not papering over a real error!)
> > > > 
> > > > 							Thanx, Paul
> > > > 
> > > > > [  103.790505] ===================================================
> > > > > [  103.790509] [ INFO: suspicious rcu_dereference_check() usage. ]
> > > > > [  103.790511] ---------------------------------------------------
> > > > > [  103.790514] kernel/cgroup.c:4432 invoked rcu_dereference_check() without protection!
> > > > > [  103.790517]
> > > > > [  103.790517] other info that might help us debug this:
> > > > > [  103.790519]
> > > > > [  103.790521]
> > > > > [  103.790521] rcu_scheduler_active = 1, debug_locks = 1
> > > > > [  103.790524] 4 locks held by bash/4422:
> > > > > [  103.790526]  #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff8114befa>] sysfs_write_file+0x3c/0x144
> > > > > [  103.790537]  #1:  (s_active#102){.+.+.+}, at: [<ffffffff8114bfa5>] sysfs_write_file+0xe7/0x144
> > > > > [  103.790544]  #2:  (&q->sysfs_lock){+.+.+.}, at: [<ffffffff812263b1>] queue_attr_store+0x49/0x8f
> > > > > [  103.790552]  #3:  (&(&blkcg->lock)->rlock){......}, at: [<ffffffff8122e4db>] blkiocg_add_blkio_group+0x2b/0xad
> > > > > [  103.790560]
> > > > > [  103.790561] stack backtrace:
> > > > > [  103.790564] Pid: 4422, comm: bash Not tainted 2.6.34-rc4-blkio-second-crash #81
> > > > > [  103.790567] Call Trace:
> > > > > [  103.790572]  [<ffffffff81068f57>] lockdep_rcu_dereference+0x9d/0xa5
> > > > > [  103.790577]  [<ffffffff8107fac1>] css_id+0x44/0x57
> > > > > [  103.790581]  [<ffffffff8122e503>] blkiocg_add_blkio_group+0x53/0xad
> > > > > [  103.790586]  [<ffffffff81231936>] cfq_init_queue+0x139/0x32c
> > > > > [  103.790591]  [<ffffffff8121f2d0>] elv_iosched_store+0xbf/0x1bf
> > > > > [  103.790595]  [<ffffffff812263d8>] queue_attr_store+0x70/0x8f
> > > > > [  103.790599]  [<ffffffff8114bfa5>] ? sysfs_write_file+0xe7/0x144
> > > > > [  103.790603]  [<ffffffff8114bfc6>] sysfs_write_file+0x108/0x144
> > > > > [  103.790609]  [<ffffffff810f527f>] vfs_write+0xae/0x10b
> > > > > [  103.790612]  [<ffffffff81069863>] ? trace_hardirqs_on_caller+0x10c/0x130
> > > > > [  103.790616]  [<ffffffff810f539c>] sys_write+0x4a/0x6e
> > > > > [  103.790622]  [<ffffffff81002b5b>] system_call_fastpath+0x16/0x1b
> > > > > [  103.790625]
> > > > > 
> > > > > Signed-off-by: Vivek Goyal <vgoyal@...hat.com>
> > > > > ---
> > > > >  block/cfq-iosched.c |    2 ++
> > > > >  1 files changed, 2 insertions(+), 0 deletions(-)
> > > > > 
> > > > > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> > > > > index 002a5b6..9386bf8 100644
> > > > > --- a/block/cfq-iosched.c
> > > > > +++ b/block/cfq-iosched.c
> > > > > @@ -3741,8 +3741,10 @@ static void *cfq_init_queue(struct request_queue *q)
> > > > >  	 * to make sure that cfq_put_cfqg() does not try to kfree root group
> > > > >  	 */
> > > > >  	atomic_set(&cfqg->ref, 1);
> > > > > +	rcu_read_lock();
> > > > >  	blkiocg_add_blkio_group(&blkio_root_cgroup, &cfqg->blkg, (void *)cfqd,
> > > > >  					0);
> > > > > +	rcu_read_unlock();
> > > > >  #endif
> > > > >  	/*
> > > > >  	 * Not strictly needed (since RB_ROOT just clears the node and we
> > > > > -- 
> > > > > 1.6.2.5
> > > > > 
