linux-kernel - Re: [PATCH] RCU: don't turn off lockdep when find suspicious rcu_dereference

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100425023455.GM2440@linux.vnet.ibm.com>
Date:	Sat, 24 Apr 2010 19:34:55 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Miles Lane <miles.lane@...il.com>
Cc:	Vivek Goyal <vgoyal@...hat.com>, Eric Paris <eparis@...hat.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <peterz@...radead.org>,
	LKML <linux-kernel@...r.kernel.org>, nauman@...gle.com,
	eric.dumazet@...il.com, netdev@...r.kernel.org,
	Jens Axboe <jens.axboe@...cle.com>,
	Gui Jianfeng <guijianfeng@...fujitsu.com>,
	Li Zefan <lizf@...fujitsu.com>,
	Johannes Berg <johannes@...solutions.net>
Subject: Re: [PATCH] RCU: don't turn off lockdep when find suspicious
 rcu_dereference_check() usage

On Fri, Apr 23, 2010 at 06:59:12PM -0400, Miles Lane wrote:
> On Fri, Apr 23, 2010 at 3:42 PM, Paul E. McKenney
> <paulmck@...ux.vnet.ibm.com> wrote:
> > On Fri, Apr 23, 2010 at 08:50:59AM -0400, Miles Lane wrote:
> >> Hi Paul,
> >> There has been a bit of back and forth, and I am not sure what patches
> >> I should test now.
> >> Could you send me a bundle of whatever needs testing now?
> >
> > Hello, Miles,
> >
> > I am posting my set as replies to this message.  There are a couple
> > of KVM fixes that are going up via Avi's tree, and a number of networking
> > fixes that are going up via Dave Miller's tree -- a number of these
> > are against quickly changing code, so it didn't make sense for me to
> > keep them separately.
> >
> > I believe that the two splats below are addressed by this patch set
> > carried in the networking tree:
> >
> >        https://patchwork.kernel.org/patch/90754/
> 
> With your twelve patches and the one linked to above applied to
> 2.6.34-rc5-git3, here are the warnings I see:
> 
> [    0.173969] [ INFO: suspicious rcu_dereference_check() usage. ]
> [    0.174097] ---------------------------------------------------
> [    0.174226] include/linux/cgroup.h:534 invoked
> rcu_dereference_check() without protection!
> [    0.174429]
> [    0.174430] other info that might help us debug this:
> [    0.174431]
> [    0.174792]
> [    0.174793] rcu_scheduler_active = 1, debug_locks = 1
> [    0.175037] no locks held by watchdog/0/5.
> [    0.175162]
> [    0.175163] stack backtrace:
> [    0.175405] Pid: 5, comm: watchdog/0 Not tainted 2.6.34-rc5-git3 #22
> [    0.175534] Call Trace:
> [    0.175666]  [<ffffffff81067fbe>] lockdep_rcu_dereference+0x9d/0xa5
> [    0.175799]  [<ffffffff8102d678>] task_subsys_state+0x59/0x70
> [    0.175931]  [<ffffffff810328fa>] __sched_setscheduler+0x19d/0x300
> [    0.176064]  [<ffffffff8102b477>] ? need_resched+0x1e/0x28
> [    0.176196]  [<ffffffff813cd401>] ? schedule+0x5c3/0x66e
> [    0.176327]  [<ffffffff81091943>] ? watchdog+0x0/0x8c
> [    0.176457]  [<ffffffff81032a78>] sched_setscheduler+0xe/0x10
> [    0.176587]  [<ffffffff8109196d>] watchdog+0x2a/0x8c
> [    0.176677]  [<ffffffff81091943>] ? watchdog+0x0/0x8c
> [    0.176808]  [<ffffffff81057152>] kthread+0x89/0x91
> [    0.176939]  [<ffffffff8106891e>] ? trace_hardirqs_on_caller+0x114/0x13f
> [    0.177073]  [<ffffffff81003994>] kernel_thread_helper+0x4/0x10
> [    0.177204]  [<ffffffff813cfc40>] ? restore_args+0x0/0x30
> [    0.177334]  [<ffffffff810570c9>] ? kthread+0x0/0x91
> [    0.177463]  [<ffffffff81003990>] ? kernel_thread_helper+0x0/0x10

According to Documentation/cgroups/cgroups.txt, we must hold cgroup_mutex,
the task's task_alloc lock, or be in an RCU read-side critical section.
We are in neither of these.

I would argue that sched_setscheduler() should take care of
synchronization, but am not sure which of these three are appropriate
for sched_setscheduler() to acquire.  Peter, thoughts?

> [    3.173419] [ INFO: suspicious rcu_dereference_check() usage. ]
> [    3.173419] ---------------------------------------------------
> [    3.173419] kernel/cgroup.c:4438 invoked rcu_dereference_check()
> without protection!
> [    3.173419]
> [    3.173419] other info that might help us debug this:
> [    3.173419]
> [    3.173419]
> [    3.173419] rcu_scheduler_active = 1, debug_locks = 1
> [    3.173419] 2 locks held by async/0/668:
> [    3.173419]  #0:  (&shost->scan_mutex){+.+.+.}, at:
> [<ffffffff812df020>] __scsi_add_device+0x83/0xe4
> [    3.173419]  #1:  (&(&blkcg->lock)->rlock){......}, at:
> [<ffffffff811f2df9>] blkiocg_add_blkio_group+0x29/0x7f
> [    3.173419]
> [    3.173419] stack backtrace:
> [    3.173419] Pid: 668, comm: async/0 Not tainted 2.6.34-rc5-git3 #22
> [    3.173419] Call Trace:
> [    3.173419]  [<ffffffff81067fbe>] lockdep_rcu_dereference+0x9d/0xa5
> [    3.173419]  [<ffffffff8107f9ad>] css_id+0x3f/0x51
> [    3.173419]  [<ffffffff811f2e08>] blkiocg_add_blkio_group+0x38/0x7f
> [    3.173419]  [<ffffffff811f4dd0>] cfq_init_queue+0xdf/0x2dc
> [    3.173419]  [<ffffffff811e33b1>] elevator_init+0xba/0xf5
> [    3.173419]  [<ffffffff812dbfaa>] ? scsi_request_fn+0x0/0x451
> [    3.173419]  [<ffffffff811e68d7>] blk_init_queue_node+0x12f/0x135
> [    3.173419]  [<ffffffff811e68e9>] blk_init_queue+0xc/0xe
> [    3.173419]  [<ffffffff812dc41c>] __scsi_alloc_queue+0x21/0x111
> [    3.173419]  [<ffffffff812dc524>] scsi_alloc_queue+0x18/0x64
> [    3.173419]  [<ffffffff812de520>] scsi_alloc_sdev+0x19e/0x256
> [    3.173419]  [<ffffffff812de6be>] scsi_probe_and_add_lun+0xe6/0x9c5
> [    3.173419]  [<ffffffff8106891e>] ? trace_hardirqs_on_caller+0x114/0x13f
> [    3.173419]  [<ffffffff813ce056>] ? __mutex_lock_common+0x3e4/0x43a
> [    3.173419]  [<ffffffff812df020>] ? __scsi_add_device+0x83/0xe4
> [    3.173419]  [<ffffffff812d09dc>] ? transport_setup_classdev+0x0/0x17
> [    3.173419]  [<ffffffff812df020>] ? __scsi_add_device+0x83/0xe4
> [    3.173419]  [<ffffffff812df055>] __scsi_add_device+0xb8/0xe4
> [    3.173419]  [<ffffffff812ea945>] ata_scsi_scan_host+0x74/0x16e
> [    3.173419]  [<ffffffff81057699>] ? autoremove_wake_function+0x0/0x34
> [    3.173419]  [<ffffffff812e8de4>] async_port_probe+0xab/0xb7
> [    3.173419]  [<ffffffff8105e1b1>] ? async_thread+0x0/0x1f4
> [    3.173419]  [<ffffffff8105e2b6>] async_thread+0x105/0x1f4
> [    3.173419]  [<ffffffff81033d8e>] ? default_wake_function+0x0/0xf
> [    3.173419]  [<ffffffff8105e1b1>] ? async_thread+0x0/0x1f4
> [    3.173419]  [<ffffffff81057152>] kthread+0x89/0x91
> [    3.173419]  [<ffffffff8106891e>] ? trace_hardirqs_on_caller+0x114/0x13f
> [    3.173419]  [<ffffffff81003994>] kernel_thread_helper+0x4/0x10
> [    3.173419]  [<ffffffff813cfc40>] ? restore_args+0x0/0x30
> [    3.173419]  [<ffffffff810570c9>] ? kthread+0x0/0x91
> [    3.173419]  [<ffffffff81003990>] ? kernel_thread_helper+0x0/0x10

Please see below for a patch for this based on my earlier conversation
with Vivek Goyal.  (Vivek, if you are already pushing a fix elsewhere,
please let me know, and I will drop my patch in favor of yours.)

> [   32.905446] [ INFO: suspicious rcu_dereference_check() usage. ]
> [   32.905449] ---------------------------------------------------
> [   32.905453] net/core/dev.c:1993 invoked rcu_dereference_check()
> without protection!
> [   32.905456]
> [   32.905457] other info that might help us debug this:
> [   32.905458]
> [   32.905461]
> [   32.905462] rcu_scheduler_active = 1, debug_locks = 1
> [   32.905466] 2 locks held by canberra-gtk-pl/4182:
> [   32.905469]  #0:  (sk_lock-AF_INET){+.+.+.}, at:
> [<ffffffff81394f7d>] inet_stream_connect+0x3a/0x24d
> [   32.905483]  #1:  (rcu_read_lock_bh){.+....}, at:
> [<ffffffff8134a789>] dev_queue_xmit+0x14e/0x4b8
> [   32.905495]
> [   32.905496] stack backtrace:
> [   32.905500] Pid: 4182, comm: canberra-gtk-pl Not tainted 2.6.34-rc5-git3 #22
> [   32.905504] Call Trace:
> [   32.905512]  [<ffffffff81067fbe>] lockdep_rcu_dereference+0x9d/0xa5
> [   32.905518]  [<ffffffff8134a894>] dev_queue_xmit+0x259/0x4b8
> [   32.905524]  [<ffffffff8134a789>] ? dev_queue_xmit+0x14e/0x4b8
> [   32.905531]  [<ffffffff81041c66>] ? _local_bh_enable_ip+0xcd/0xda
> [   32.905538]  [<ffffffff813536da>] neigh_resolve_output+0x234/0x285
> [   32.905544]  [<ffffffff8136f69f>] ip_finish_output2+0x257/0x28c
> [   32.905549]  [<ffffffff8136f73c>] ip_finish_output+0x68/0x6a
> [   32.905554]  [<ffffffff81370433>] T.866+0x52/0x59
> [   32.905559]  [<ffffffff8137067e>] ip_output+0xaa/0xb4
> [   32.905565]  [<ffffffff8136eb38>] ip_local_out+0x20/0x24
> [   32.905571]  [<ffffffff8136f184>] ip_queue_xmit+0x309/0x368
> [   32.905578]  [<ffffffff810e4226>] ? __kmalloc_track_caller+0x111/0x155
> [   32.905585]  [<ffffffff8138316f>] ? tcp_connect+0x223/0x3d3
> [   32.905591]  [<ffffffff813818f1>] tcp_transmit_skb+0x707/0x745
> [   32.905597]  [<ffffffff813832c2>] tcp_connect+0x376/0x3d3
> [   32.905604]  [<ffffffff81268a43>] ? secure_tcp_sequence_number+0x55/0x6f
> [   32.905610]  [<ffffffff81387270>] tcp_v4_connect+0x3df/0x455
> [   32.905617]  [<ffffffff8133cb59>] ? lock_sock_nested+0xf3/0x102
> [   32.905623]  [<ffffffff81394fe7>] inet_stream_connect+0xa4/0x24d
> [   32.905629]  [<ffffffff8133b398>] sys_connect+0x90/0xd0
> [   32.905636]  [<ffffffff81002b9c>] ? sysret_check+0x27/0x62
> [   32.905642]  [<ffffffff8106891e>] ? trace_hardirqs_on_caller+0x114/0x13f
> [   32.905649]  [<ffffffff813cec80>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [   32.905655]  [<ffffffff81002b6b>] system_call_fastpath+0x16/0x1b

A fix for the above is already in Dave Miller's tree.

> [   51.912282] [ INFO: suspicious rcu_dereference_check() usage. ]
> [   51.912285] ---------------------------------------------------
> [   51.912289] net/mac80211/sta_info.c:886 invoked
> rcu_dereference_check() without protection!
> [   51.912293]
> [   51.912293] other info that might help us debug this:
> [   51.912295]
> [   51.912298]
> [   51.912298] rcu_scheduler_active = 1, debug_locks = 1
> [   51.912302] no locks held by wpa_supplicant/3951.
> [   51.912305]
> [   51.912306] stack backtrace:
> [   51.912310] Pid: 3951, comm: wpa_supplicant Not tainted 2.6.34-rc5-git3 #22
> [   51.912314] Call Trace:
> [   51.912317]  <IRQ>  [<ffffffff81067fbe>] lockdep_rcu_dereference+0x9d/0xa5
> [   51.912345]  [<ffffffffa014f9ae>]
> ieee80211_find_sta_by_hw+0x46/0x10f [mac80211]
> [   51.912358]  [<ffffffffa014fa8e>] ieee80211_find_sta+0x17/0x19 [mac80211]
> [   51.912373]  [<ffffffffa01e50f2>] iwl_tx_queue_reclaim+0xdb/0x1b1 [iwlcore]
> [   51.912380]  [<ffffffff8106842b>] ? mark_lock+0x2d/0x235
> [   51.912391]  [<ffffffffa0252f1c>] iwl5000_rx_reply_tx+0x4a9/0x556 [iwlagn]
> [   51.912399]  [<ffffffff8120a353>] ? is_swiotlb_buffer+0x2e/0x3b
> [   51.912407]  [<ffffffffa024bbf4>] iwl_rx_handle+0x163/0x2b5 [iwlagn]
> [   51.912414]  [<ffffffff81068904>] ? trace_hardirqs_on_caller+0xfa/0x13f
> [   51.912422]  [<ffffffffa024c3ac>] iwl_irq_tasklet+0x2bb/0x3c0 [iwlagn]
> [   51.912429]  [<ffffffff810411f3>] tasklet_action+0xa7/0x10f
> [   51.912435]  [<ffffffff81042205>] __do_softirq+0x144/0x252
> [   51.912442]  [<ffffffff81003a8c>] call_softirq+0x1c/0x34
> [   51.912447]  [<ffffffff810050e4>] do_softirq+0x38/0x80
> [   51.912452]  [<ffffffff81041cd2>] irq_exit+0x45/0x94
> [   51.912457]  [<ffffffff81004829>] do_IRQ+0xad/0xc4
> [   51.912463]  [<ffffffff810cbbd3>] ? might_fault+0x63/0xb3
> [   51.912470]  [<ffffffff813cfb93>] ret_from_intr+0x0/0xf
> [   51.912474]  <EOI>  [<ffffffff810cbbd3>] ? might_fault+0x63/0xb3
> [   51.912484]  [<ffffffff8106a75d>] ? lock_release+0x208/0x215
> [   51.912490]  [<ffffffff810cbc1c>] might_fault+0xac/0xb3
> [   51.912495]  [<ffffffff810cbbd3>] ? might_fault+0x63/0xb3
> [   51.912501]  [<ffffffff812025e3>] __clear_user+0x15/0x59
> [   51.912508]  [<ffffffff8100b2bc>] save_i387_xstate+0x9c/0x1bc
> [   51.912515]  [<ffffffff81002276>] do_signal+0x240/0x686
> [   51.912521]  [<ffffffff81002b9c>] ? sysret_check+0x27/0x62
> [   51.912527]  [<ffffffff8106891e>] ? trace_hardirqs_on_caller+0x114/0x13f
> [   51.912533]  [<ffffffff813cec80>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [   51.912539]  [<ffffffff810026e3>] do_notify_resume+0x27/0x5f
> [   51.912545]  [<ffffffff813cec80>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [   51.912551]  [<ffffffff81002e86>] int_signal+0x12/0x17

This is a repeat from last time that confused me at the time.  I could
do a hacky "fix" by putting an RCU read-side critical section around
the for_each_sta_info() in ieee80211_find_sta_by_hw(), but I do not
understand this code well enough to feel comfortable doing so.

Johannes, any enlightenment?

> [   51.929529] [ INFO: suspicious rcu_dereference_check() usage. ]
> [   51.929532] ---------------------------------------------------
> [   51.929536] net/mac80211/sta_info.c:886 invoked
> rcu_dereference_check() without protection!
> [   51.929540]
> [   51.929541] other info that might help us debug this:
> [   51.929542]
> [   51.929545]
> [   51.929546] rcu_scheduler_active = 1, debug_locks = 1
> [   51.929550] 1 lock held by Xorg/4013:
> [   51.929553]  #0:  (clock-AF_UNIX){++.+..}, at: [<ffffffff8133cebd>]
> sock_def_readable+0x19/0x62
> [   51.929567]
> [   51.929568] stack backtrace:
> [   51.929573] Pid: 4013, comm: Xorg Not tainted 2.6.34-rc5-git3 #22
> [   51.929576] Call Trace:
> [   51.929579]  <IRQ>  [<ffffffff81067fbe>] lockdep_rcu_dereference+0x9d/0xa5
> [   51.929603]  [<ffffffffa014f9fe>]
> ieee80211_find_sta_by_hw+0x96/0x10f [mac80211]
> [   51.929615]  [<ffffffffa014fa8e>] ieee80211_find_sta+0x17/0x19 [mac80211]
> [   51.929631]  [<ffffffffa01e50f2>] iwl_tx_queue_reclaim+0xdb/0x1b1 [iwlcore]
> [   51.929642]  [<ffffffffa0252f1c>] iwl5000_rx_reply_tx+0x4a9/0x556 [iwlagn]
> [   51.929649]  [<ffffffff81068685>] ? mark_held_locks+0x52/0x70
> [   51.929656]  [<ffffffff813cf46c>] ? _raw_spin_unlock_irqrestore+0x3a/0x69
> [   51.929662]  [<ffffffff8120a353>] ? is_swiotlb_buffer+0x2e/0x3b
> [   51.929671]  [<ffffffffa024bbf4>] iwl_rx_handle+0x163/0x2b5 [iwlagn]
> [   51.929680]  [<ffffffffa024c3ac>] iwl_irq_tasklet+0x2bb/0x3c0 [iwlagn]
> [   51.929687]  [<ffffffff810411f3>] tasklet_action+0xa7/0x10f
> [   51.929693]  [<ffffffff81042205>] __do_softirq+0x144/0x252
> [   51.929700]  [<ffffffff81003a8c>] call_softirq+0x1c/0x34
> [   51.929705]  [<ffffffff810050e4>] do_softirq+0x38/0x80
> [   51.929711]  [<ffffffff81041cd2>] irq_exit+0x45/0x94
> [   51.929717]  [<ffffffff81019b10>] smp_apic_timer_interrupt+0x87/0x95
> [   51.929724]  [<ffffffff81003553>] apic_timer_interrupt+0x13/0x20
> [   51.929727]  <EOI>  [<ffffffff813cf46e>] ?
> _raw_spin_unlock_irqrestore+0x3c/0x69
> [   51.929739]  [<ffffffff8102d3fb>] __wake_up_sync_key+0x49/0x52
> [   51.929745]  [<ffffffff8133cee7>] sock_def_readable+0x43/0x62
> [   51.929751]  [<ffffffff813b1c61>] unix_stream_sendmsg+0x243/0x2e2
> [   51.929758]  [<ffffffff8133b912>] ? sock_aio_write+0x0/0xcf
> [   51.929764]  [<ffffffff81339342>] __sock_sendmsg+0x59/0x64
> [   51.929770]  [<ffffffff8133b9cd>] sock_aio_write+0xbb/0xcf
> [   51.929777]  [<ffffffff810e9909>] do_sync_readv_writev+0xbc/0xfb
> [   51.929785]  [<ffffffff811c1792>] ? selinux_file_permission+0xa2/0xaf
> [   51.929790]  [<ffffffff810e9690>] ? copy_from_user+0x2a/0x2c
> [   51.929797]  [<ffffffff811baff1>] ? security_file_permission+0x11/0x13
> [   51.929804]  [<ffffffff810ea6a6>] do_readv_writev+0xa2/0x122
> [   51.929810]  [<ffffffff810ead93>] ? fcheck_files+0x8f/0xc9
> [   51.929816]  [<ffffffff810ea764>] vfs_writev+0x3e/0x49
> [   51.929821]  [<ffffffff810ea84a>] sys_writev+0x45/0x8e
> [   51.929828]  [<ffffffff81002b6b>] system_call_fastpath+0x16/0x1b

Ditto.

						Thanx, Paul

------------------------------------------------------------------------

commit 0868dd631def762ba00c2f0f397a53c5cdf24ae2
Author: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
Date:   Sat Apr 24 19:23:30 2010 -0700

    block-cgroup: fix RCU-lockdep splat in blkiocg_add_blkio_group()
    
    It is necessary to be in an RCU read-side critical section when invoking
    css_id(), so this patch adds one to blkiocg_add_blkio_group().  This is
    actually a false positive, because this is called at initialization time,
    and hence always refers to the root cgroup, which cannot go away.
    
    Located-by: Miles Lane <miles.lane@...il.com>
    Suggested-by: Vivek Goyal <vgoyal@...hat.com>
    Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 5fe03de..55c8c73 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -71,7 +71,9 @@ void blkiocg_add_blkio_group(struct blkio_cgroup *blkcg,
 
 	spin_lock_irqsave(&blkcg->lock, flags);
 	rcu_assign_pointer(blkg->key, key);
+	rcu_read_lock();
 	blkg->blkcg_id = css_id(&blkcg->css);
+	rcu_read_unlock();
 	hlist_add_head_rcu(&blkg->blkcg_node, &blkcg->blkg_list);
 	spin_unlock_irqrestore(&blkcg->lock, flags);
 #ifdef CONFIG_DEBUG_BLK_CGROUP
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/