Date:   Tue, 17 Jul 2018 21:27:50 +0200
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Tariq Toukan <tariqt@...lanox.com>
Cc:     "David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
        Eran Ben Elisha <eranbe@...lanox.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>
Subject: Re: [PATCH net] net/xdp: Fix suspicious RCU usage warning

On 07/17/2018 06:47 PM, Alexei Starovoitov wrote:
> On Tue, Jul 17, 2018 at 06:10:38PM +0300, Tariq Toukan wrote:
>> Fix the warning below by calling rhashtable_lookup under
>> RCU read lock.
>>
>> [  342.450870] WARNING: suspicious RCU usage
>> [  342.455856] 4.18.0-rc2+ #17 Tainted: G           O
>> [  342.462210] -----------------------------
>> [  342.467202] ./include/linux/rhashtable.h:481 suspicious rcu_dereference_check() usage!
>> [  342.476568]
>> [  342.476568] other info that might help us debug this:
>> [  342.476568]
>> [  342.486978]
>> [  342.486978] rcu_scheduler_active = 2, debug_locks = 1
>> [  342.495211] 4 locks held by modprobe/3934:
>> [  342.500265]  #0: 00000000e23116b2 (mlx5_intf_mutex){+.+.}, at: mlx5_unregister_interface+0x18/0x90 [mlx5_core]
>> [  342.511953]  #1: 00000000ca16db96 (rtnl_mutex){+.+.}, at: unregister_netdev+0xe/0x20
>> [  342.521109]  #2: 00000000a46e2c4b (&priv->state_lock){+.+.}, at: mlx5e_close+0x29/0x60 [mlx5_core]
>> [  342.531642]  #3: 0000000060c5bde3 (mem_id_lock){+.+.}, at: xdp_rxq_info_unreg+0x93/0x6b0
>> [  342.541206]
>> [  342.541206] stack backtrace:
>> [  342.547075] CPU: 12 PID: 3934 Comm: modprobe Tainted: G           O      4.18.0-rc2+ #17
>> [  342.556621] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 1.5.4 10/002/2015
>> [  342.565606] Call Trace:
>> [  342.568861]  dump_stack+0x78/0xb3
>> [  342.573086]  xdp_rxq_info_unreg+0x3f5/0x6b0
>> [  342.578285]  ? __call_rcu+0x220/0x300
>> [  342.582911]  mlx5e_free_rq+0x38/0xc0 [mlx5_core]
>> [  342.588602]  mlx5e_close_channel+0x20/0x120 [mlx5_core]
>> [  342.594976]  mlx5e_close_channels+0x26/0x40 [mlx5_core]
>> [  342.601345]  mlx5e_close_locked+0x44/0x50 [mlx5_core]
>> [  342.607519]  mlx5e_close+0x42/0x60 [mlx5_core]
>> [  342.613005]  __dev_close_many+0xb1/0x120
>> [  342.617911]  dev_close_many+0xa2/0x170
>> [  342.622622]  rollback_registered_many+0x148/0x460
>> [  342.628401]  ? __lock_acquire+0x48d/0x11b0
>> [  342.633498]  ? unregister_netdev+0xe/0x20
>> [  342.638495]  rollback_registered+0x56/0x90
>> [  342.643588]  unregister_netdevice_queue+0x7e/0x100
>> [  342.649461]  unregister_netdev+0x18/0x20
>> [  342.654362]  mlx5e_remove+0x2a/0x50 [mlx5_core]
>> [  342.659944]  mlx5_remove_device+0xe5/0x110 [mlx5_core]
>> [  342.666208]  mlx5_unregister_interface+0x39/0x90 [mlx5_core]
>> [  342.673038]  cleanup+0x5/0xbfc [mlx5_core]
>> [  342.678094]  __x64_sys_delete_module+0x16b/0x240
>> [  342.683725]  ? do_syscall_64+0x1c/0x210
>> [  342.688476]  do_syscall_64+0x5a/0x210
>> [  342.693025]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>
>> Fixes: 8d5d88527587 ("xdp: rhashtable with allocator ID to pointer mapping")
>> Signed-off-by: Tariq Toukan <tariqt@...lanox.com>
>> Cc: Jesper Dangaard Brouer <brouer@...hat.com>
>> ---
>>  net/core/xdp.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>> index 9d1f22072d5d..c20fefbfb76c 100644
>> --- a/net/core/xdp.c
>> +++ b/net/core/xdp.c
>> @@ -102,7 +102,9 @@ static void __xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq)
>>  
>>  	mutex_lock(&mem_id_lock);
>>  
>> +	rcu_read_lock();
>>  	xa = rhashtable_lookup(mem_id_ht, &id, mem_id_rht_params);
>> +	rcu_read_unlock();
>>  	if (!xa) {
> 
> if it's an actual bug rcu_read_unlock seems to be misplaced.
> It silences the warn, but rcu section looks wrong.

I think that whole piece in __xdp_rxq_info_unreg_mem_model() should be:

  mutex_lock(&mem_id_lock);
  xa = rhashtable_lookup_fast(mem_id_ht, &id, mem_id_rht_params);
  if (xa && rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params) == 0)
          call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
  mutex_unlock(&mem_id_lock);

Technically, taking the RCU read side around rhashtable_lookup() is equivalent,
but let's use the proper API. In the doc (https://lwn.net/Articles/751374/),
object removal is additionally wrapped in an RCU read-side section, but in our
case we're behind mem_id_lock, which serializes insertion/removal.

Cheers,
Daniel
