netdev - Re: [PATCH] RCU: don't turn off lockdep when find suspicious rcu_dereference


Message-ID: <20100422145640.GB3228@redhat.com>
Date:	Thu, 22 Apr 2010 10:56:40 -0400
From:	Vivek Goyal <vgoyal@...hat.com>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:	Miles Lane <miles.lane@...il.com>, Eric Paris <eparis@...hat.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <peterz@...radead.org>,
	LKML <linux-kernel@...r.kernel.org>, nauman@...gle.com,
	eric.dumazet@...il.com, netdev@...r.kernel.org,
	Jens Axboe <jens.axboe@...cle.com>,
	Gui Jianfeng <guijianfeng@...fujitsu.com>,
	Li Zefan <lizf@...fujitsu.com>
Subject: Re: [PATCH] RCU: don't turn off lockdep when find suspicious
	rcu_dereference_check() usage

On Wed, Apr 21, 2010 at 02:35:43PM -0700, Paul E. McKenney wrote:

[..]
> > [    3.116754] [ INFO: suspicious rcu_dereference_check() usage. ]
> > [    3.116754] ---------------------------------------------------
> > [    3.116754] kernel/cgroup.c:4432 invoked rcu_dereference_check()
> > without protection!
> > [    3.116754]
> > [    3.116754] other info that might help us debug this:
> > [    3.116754]
> > [    3.116754]
> > [    3.116754] rcu_scheduler_active = 1, debug_locks = 1
> > [    3.116754] 2 locks held by async/1/666:
> > [    3.116754]  #0:  (&shost->scan_mutex){+.+.+.}, at:
> > [<ffffffff812df0a0>] __scsi_add_device+0x83/0xe4
> > [    3.116754]  #1:  (&(&blkcg->lock)->rlock){......}, at:
> > [<ffffffff811f2e8d>] blkiocg_add_blkio_group+0x29/0x7f
> > [    3.116754]
> > [    3.116754] stack backtrace:
> > [    3.116754] Pid: 666, comm: async/1 Not tainted 2.6.34-rc5 #18
> > [    3.116754] Call Trace:
> > [    3.116754]  [<ffffffff81067fc2>] lockdep_rcu_dereference+0x9d/0xa5
> > [    3.116754]  [<ffffffff8107f9b1>] css_id+0x3f/0x51
> > [    3.116754]  [<ffffffff811f2e9c>] blkiocg_add_blkio_group+0x38/0x7f
> > [    3.116754]  [<ffffffff811f4e64>] cfq_init_queue+0xdf/0x2dc
> > [    3.116754]  [<ffffffff811e3445>] elevator_init+0xba/0xf5
> > [    3.116754]  [<ffffffff812dc02a>] ? scsi_request_fn+0x0/0x451
> > [    3.116754]  [<ffffffff811e696b>] blk_init_queue_node+0x12f/0x135
> > [    3.116754]  [<ffffffff811e697d>] blk_init_queue+0xc/0xe
> > [    3.116754]  [<ffffffff812dc49c>] __scsi_alloc_queue+0x21/0x111
> > [    3.116754]  [<ffffffff812dc5a4>] scsi_alloc_queue+0x18/0x64
> > [    3.116754]  [<ffffffff812de5a0>] scsi_alloc_sdev+0x19e/0x256
> > [    3.116754]  [<ffffffff812de73e>] scsi_probe_and_add_lun+0xe6/0x9c5
> > [    3.116754]  [<ffffffff81068922>] ? trace_hardirqs_on_caller+0x114/0x13f
> > [    3.116754]  [<ffffffff813ce0d6>] ? __mutex_lock_common+0x3e4/0x43a
> > [    3.116754]  [<ffffffff812df0a0>] ? __scsi_add_device+0x83/0xe4
> > [    3.116754]  [<ffffffff812d0a5c>] ? transport_setup_classdev+0x0/0x17
> > [    3.116754]  [<ffffffff812df0a0>] ? __scsi_add_device+0x83/0xe4
> > [    3.116754]  [<ffffffff812df0d5>] __scsi_add_device+0xb8/0xe4
> > [    3.116754]  [<ffffffff812ea9c5>] ata_scsi_scan_host+0x74/0x16e
> > [    3.116754]  [<ffffffff81057685>] ? autoremove_wake_function+0x0/0x34
> > [    3.116754]  [<ffffffff812e8e64>] async_port_probe+0xab/0xb7
> > [    3.116754]  [<ffffffff8105e1b5>] ? async_thread+0x0/0x1f4
> > [    3.116754]  [<ffffffff8105e2ba>] async_thread+0x105/0x1f4
> > [    3.116754]  [<ffffffff81033d79>] ? default_wake_function+0x0/0xf
> > [    3.116754]  [<ffffffff8105e1b5>] ? async_thread+0x0/0x1f4
> > [    3.116754]  [<ffffffff8105713e>] kthread+0x89/0x91
> > [    3.116754]  [<ffffffff81068922>] ? trace_hardirqs_on_caller+0x114/0x13f
> > [    3.116754]  [<ffffffff81003994>] kernel_thread_helper+0x4/0x10
> > [    3.116754]  [<ffffffff813cfcc0>] ? restore_args+0x0/0x30
> > [    3.116754]  [<ffffffff810570b5>] ? kthread+0x0/0x91
> > [    3.116754]  [<ffffffff81003990>] ? kernel_thread_helper+0x0/0x10
> 
> I cannot convince myself that the above access is safe.  Vivek, Nauman,
> thoughts?

Hi Paul,

blkiocg_add_blkio_group() is called from two paths.

The first one is shown below. This path should be safe, as it takes the
RCU read lock.

cfq_get_cfqg()
	rcu_read_lock()
	cfq_find_alloc_cfqg()
		blkiocg_add_blkio_group()
	rcu_read_unlock()

The second one is the path shown in the backtrace above.

cfq_init_queue()
	blkiocg_add_blkio_group()

This path is called at request queue and cfq initialization time, and
we access only the root cgroup (root blkio_cgroup). Since the root cgroup
can't go away, do we have to protect that call with rcu_read_lock() as
well?

So I guess it is not unsafe, but we probably need to fix the warning.
Should I wrap the second call to blkiocg_add_blkio_group() in an
rcu_read_lock()/rcu_read_unlock() pair?
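
To make the two paths concrete, here is a minimal userspace sketch, not
kernel code. All of the functions below are hypothetical stand-ins written
for illustration: rcu_read_lock()/rcu_read_unlock() only track a nesting
count, and css_id_model() reports a warning whenever it runs outside a
read-side section, which is a simplified assumption about what the lockdep
check does. The returned id value is made up.

```c
#include <stdio.h>

/* Userspace stand-ins only; these are NOT the kernel implementations. */
static int rcu_nesting;   /* depth of read-side critical sections */
static int rcu_warnings;  /* number of "suspicious usage" reports  */

static void rcu_read_lock(void)   { rcu_nesting++; }
static void rcu_read_unlock(void) { rcu_nesting--; }

/* Simplified model of the checked dereference in css_id(): it complains
 * when the caller is not inside a read-side critical section. */
static int css_id_model(void)
{
	if (rcu_nesting == 0) {
		rcu_warnings++;
		fprintf(stderr, "suspicious rcu_dereference_check() usage\n");
	}
	return 1; /* hypothetical id for the root cgroup */
}

/* Stand-in for blkiocg_add_blkio_group(): it only looks up the id. */
static int blkiocg_add_blkio_group_model(void)
{
	return css_id_model();
}

/* Init path as it is today: no read lock, so the check fires. */
static int cfq_init_queue_unprotected(void)
{
	return blkiocg_add_blkio_group_model();
}

/* Proposed change: wrap the call in a lock/unlock pair. */
static int cfq_init_queue_protected(void)
{
	int id;

	rcu_read_lock();
	id = blkiocg_add_blkio_group_model();
	rcu_read_unlock();
	return id;
}
```

In this model, calling cfq_init_queue_unprotected() bumps rcu_warnings by
one, while cfq_init_queue_protected() returns the same id without a report
and leaves rcu_nesting balanced at zero.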

Thanks
Vivek