Message-ID: <CABXGCsND1HmAjCZNS_fg59_qbQfxfcHCD_OYD2tjTYdWFDSajw@mail.gmail.com>
Date: Thu, 20 Jun 2024 11:55:05 +0500
From: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
To: kadlec@...filter.org, nnamrec@...il.com, 
	Pablo Neira Ayuso <pablo@...filter.org>, 
	Linux List Kernel Mailing <linux-kernel@...r.kernel.org>, 
	Linux regressions mailing list <regressions@...ts.linux.dev>, netdev@...r.kernel.org
Subject: 6.10/bisected/regression - commits 4e7aaa6b82d6 cause appearing
 WARNING at net/netfilter/ipset/ip_set_core.c:1200 suspicious
 rcu_dereference_protected() usage

Hi,
Between 2ef5971ff345 and rc4 I spotted a new regression: a warning
with a stack trace appears about one minute after boot.

 =============================
 WARNING: suspicious RCU usage
 6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug #1 Tainted: G        W    L    -------  ---
 -----------------------------
 net/netfilter/ipset/ip_set_core.c:1200 suspicious rcu_dereference_protected() usage!

 other info that might help us debug this:


 rcu_scheduler_active = 2, debug_locks = 1
 3 locks held by kworker/u128:1/264:
  #0: ffff88810813c958 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work+0xeab/0x1460
  #1: ffffc90001477da0 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work+0x82b/0x1460
  #2: ffffffff97d9ae98 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0xb9/0xa90

 stack backtrace:
 CPU: 30 PID: 264 Comm: kworker/u128:1 Tainted: G        W    L    -------  ---  6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug #1
 Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING WIFI, BIOS 2611 04/07/2024
 Workqueue: netns cleanup_net
 Call Trace:
  <TASK>
  dump_stack_lvl+0x84/0xd0
  lockdep_rcu_suspicious.cold+0xa1/0x134
  _destroy_all_sets+0x1c7/0x560 [ip_set]
  ip_set_net_exit+0x20/0x50 [ip_set]
  ops_exit_list+0x99/0x170
  cleanup_net+0x4d9/0xa90
  ? __pfx_cleanup_net+0x10/0x10
  process_one_work+0x8a4/0x1460
  ? worker_thread+0xe3/0x1010
  ? __pfx_process_one_work+0x10/0x10
  ? assign_work+0x16c/0x240
  worker_thread+0x5e6/0x1010
  ? __kthread_parkme+0xb1/0x1d0
  ? __pfx_worker_thread+0x10/0x10
  ? __pfx_worker_thread+0x10/0x10
  kthread+0x2d2/0x3a0
  ? _raw_spin_unlock_irq+0x28/0x60
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x31/0x70
  ? __pfx_kthread+0x10/0x10
  ret_from_fork_asm+0x1a/0x30
  </TASK>

 =============================
 WARNING: suspicious RCU usage
 6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug #1 Tainted: G        W    L    -------  ---
 -----------------------------
 net/netfilter/ipset/ip_set_core.c:1211 suspicious rcu_dereference_protected() usage!

 other info that might help us debug this:


 rcu_scheduler_active = 2, debug_locks = 1
 3 locks held by kworker/u128:1/264:
  #0: ffff88810813c958 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work+0xeab/0x1460
  #1: ffffc90001477da0 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work+0x82b/0x1460
  #2: ffffffff97d9ae98 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0xb9/0xa90

 stack backtrace:
 CPU: 30 PID: 264 Comm: kworker/u128:1 Tainted: G        W    L    -------  ---  6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug #1
 Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING WIFI, BIOS 2611 04/07/2024
 Workqueue: netns cleanup_net
 Call Trace:
  <TASK>
  dump_stack_lvl+0x84/0xd0
  lockdep_rcu_suspicious.cold+0xa1/0x134
  _destroy_all_sets+0x3a8/0x560 [ip_set]
  ip_set_net_exit+0x20/0x50 [ip_set]
  ops_exit_list+0x99/0x170
  cleanup_net+0x4d9/0xa90
  ? __pfx_cleanup_net+0x10/0x10
  process_one_work+0x8a4/0x1460
  ? worker_thread+0xe3/0x1010
  ? __pfx_process_one_work+0x10/0x10
  ? assign_work+0x16c/0x240
  worker_thread+0x5e6/0x1010
  ? __kthread_parkme+0xb1/0x1d0
  ? __pfx_worker_thread+0x10/0x10
  ? __pfx_worker_thread+0x10/0x10
  kthread+0x2d2/0x3a0
  ? _raw_spin_unlock_irq+0x28/0x60
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x31/0x70
  ? __pfx_kthread+0x10/0x10
  ret_from_fork_asm+0x1a/0x30
  </TASK>
 workqueue: gc_worker [nf_conntrack] hogged CPU for >10000us 7 times, consider switching to WQ_UNBOUND
 workqueue: gc_worker [nf_conntrack] hogged CPU for >10000us 11 times, consider switching to WQ_UNBOUND
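
For context on what this class of warning checks, here is a minimal userspace sketch in C. It models the contract of rcu_dereference_protected(p, c): on a lockdep-enabled kernel the access is reported as suspicious when the caller-supplied condition c is false. Everything in the sketch is hypothetical (struct held_locks, hold(), is_held(), model_deref_protected(), and the lock name "assumed_protecting_lock"). The real condition at ip_set_core.c:1200 is not shown in this report, so this only illustrates why the three locks listed in the trace would not satisfy a check for some other lock.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_HELD 8

/* Hypothetical stand-in for the per-task list of held locks that lockdep tracks. */
struct held_locks {
    const char *names[MAX_HELD];
    int count;
};

/* Number of "suspicious usage" reports emitted so far. */
static int suspicious_reports;

/* Record that the current (modelled) task holds the named lock. */
static void hold(struct held_locks *h, const char *name)
{
    assert(h->count < MAX_HELD);
    h->names[h->count++] = name;
}

/* Return true if the named lock is in the held list. */
static bool is_held(const struct held_locks *h, const char *name)
{
    for (int i = 0; i < h->count; i++)
        if (strcmp(h->names[i], name) == 0)
            return true;
    return false;
}

/*
 * Model of the contract: always return the pointer, but report the access
 * when the condition supplied by the caller is false.
 */
static void *model_deref_protected(void *p, bool cond, const char *where)
{
    if (!cond) {
        suspicious_reports++;
        fprintf(stderr, "%s suspicious model_deref_protected() usage!\n", where);
    }
    return p;
}
```

With only the three locks from the trace recorded via hold(), a call such as model_deref_protected(ptr, is_held(&h, "assumed_protecting_lock"), "model:") increments suspicious_reports. After hold(&h, "assumed_protecting_lock") the same call stays silent.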

Bisect blames this commit:
commit 4e7aaa6b82d63e8ddcbfb56b4fd3d014ca586f10
Author: Jozsef Kadlecsik <kadlec@...filter.org>
Date:   Tue Jun 4 15:58:03 2024 +0200

    netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type

    Lion Ackermann reported that there is a race condition between namespace cleanup
    in ipset and the garbage collection of the list:set type. The namespace
    cleanup can destroy the list:set type of sets while the gc of the set type is
    waiting to run in rcu cleanup. The latter uses data from the destroyed set which
    thus leads use after free. The patch contains the following parts:

    - When destroying all sets, first remove the garbage collectors, then wait
      if needed and then destroy the sets.
    - Fix the badly ordered "wait then remove gc" for the destroy a single set
      case.
    - Fix the missing rcu locking in the list:set type in the userspace test
      case.
    - Use proper RCU list handlings in the list:set type.

    The patch depends on c1193d9bbbd3 (netfilter: ipset: Add list flush to cancel_gc).

    Fixes: 97f7cf1cd80e (netfilter: ipset: fix performance regression in swap operation)
    Reported-by: Lion Ackermann <nnamrec@...il.com>
    Tested-by: Lion Ackermann <nnamrec@...il.com>
    Signed-off-by: Jozsef Kadlecsik <kadlec@...filter.org>
    Signed-off-by: Pablo Neira Ayuso <pablo@...filter.org>

 net/netfilter/ipset/ip_set_core.c     | 81 ++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------
 net/netfilter/ipset/ip_set_list_set.c | 30 ++++++++++++++----------------
 2 files changed, 60 insertions(+), 51 deletions(-)
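
The ordering described in the first bullet of that commit message (remove the garbage collectors, wait if needed, then destroy the sets) can be illustrated with a small single-threaded C model. This is a sketch under stated assumptions, not the kernel code: struct model_set, run_gc(), destroy_all_racy() and destroy_all_ordered() are hypothetical names, "destroyed" is a flag rather than a real free, and the deferred RCU callback is modelled as a plain function call.

```c
/* Single-threaded userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>

#define NSETS 3

struct model_set {
    bool gc_armed;   /* a gc callback is still scheduled for this set */
    bool destroyed;  /* the set has been released (modelled as a flag) */
};

/* Counts gc runs that touched an already destroyed set. */
static int use_after_free;

/* Deferred gc callback: running against a destroyed set models the bug. */
static void run_gc(struct model_set *s)
{
    if (!s->gc_armed)
        return;
    if (s->destroyed)
        use_after_free++;
    s->gc_armed = false;
}

/* Racy order: sets are destroyed while their gc is still waiting to run. */
static void destroy_all_racy(struct model_set *sets, int n)
{
    assert(sets && n >= 0);
    for (int i = 0; i < n; i++)
        sets[i].destroyed = true;
    for (int i = 0; i < n; i++)
        run_gc(&sets[i]);           /* pending gc fires after destruction */
}

/* Ordered teardown: cancel gc, drain anything pending, then destroy. */
static void destroy_all_ordered(struct model_set *sets, int n)
{
    assert(sets && n >= 0);
    for (int i = 0; i < n; i++)
        sets[i].gc_armed = false;   /* 1. remove the garbage collectors */
    for (int i = 0; i < n; i++)
        run_gc(&sets[i]);           /* 2. wait: nothing is left to run */
    for (int i = 0; i < n; i++)
        sets[i].destroyed = true;   /* 3. destroy the sets */
}
```

Starting from sets with gc_armed set, destroy_all_racy() bumps use_after_free once per set, while destroy_all_ordered() leaves it at zero.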

I can confirm that after reverting 4e7aaa6b82d6 the issue is gone.

I have also attached the build config and the full kernel log.

My hardware specs: https://linux-hardware.org/?probe=80512f0c04

Jozsef, can you look into this, please?

-- 
Best Regards,
Mike Gavrilov.

Download attachment "dmesg-6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug.zip" of type "application/zip" (52668 bytes)

Download attachment ".config.zip" of type "application/zip" (66526 bytes)
