[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABXGCsND1HmAjCZNS_fg59_qbQfxfcHCD_OYD2tjTYdWFDSajw@mail.gmail.com>
Date: Thu, 20 Jun 2024 11:55:05 +0500
From: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
To: kadlec@...filter.org, nnamrec@...il.com,
Pablo Neira Ayuso <pablo@...filter.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
Linux regressions mailing list <regressions@...ts.linux.dev>, netdev@...r.kernel.org
Subject: 6.10/bisected/regression - commits 4e7aaa6b82d6 cause appearing
WARNING at net/netfilter/ipset/ip_set_core.c:1200 suspicious
rcu_dereference_protected() usage
Hi,
between 2ef5971ff345 and rc4 I spotted a new regression.
It is expressed in the appearance warning with stacktrace after one
minute after boot.
=============================
WARNING: suspicious RCU usage
6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug #1 Tainted:
G W L ------- ---
-----------------------------
net/netfilter/ipset/ip_set_core.c:1200 suspicious
rcu_dereference_protected() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
3 locks held by kworker/u128:1/264:
#0: ffff88810813c958 ((wq_completion)netns){+.+.}-{0:0}, at:
process_one_work+0xeab/0x1460
#1: ffffc90001477da0 (net_cleanup_work){+.+.}-{0:0}, at:
process_one_work+0x82b/0x1460
#2: ffffffff97d9ae98 (pernet_ops_rwsem){++++}-{3:3}, at:
cleanup_net+0xb9/0xa90
stack backtrace:
CPU: 30 PID: 264 Comm: kworker/u128:1 Tainted: G W L
------- --- 6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug
#1
Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING
WIFI, BIOS 2611 04/07/2024
Workqueue: netns cleanup_net
Call Trace:
<TASK>
dump_stack_lvl+0x84/0xd0
lockdep_rcu_suspicious.cold+0xa1/0x134
_destroy_all_sets+0x1c7/0x560 [ip_set]
ip_set_net_exit+0x20/0x50 [ip_set]
ops_exit_list+0x99/0x170
cleanup_net+0x4d9/0xa90
? __pfx_cleanup_net+0x10/0x10
process_one_work+0x8a4/0x1460
? worker_thread+0xe3/0x1010
? __pfx_process_one_work+0x10/0x10
? assign_work+0x16c/0x240
worker_thread+0x5e6/0x1010
? __kthread_parkme+0xb1/0x1d0
? __pfx_worker_thread+0x10/0x10
? __pfx_worker_thread+0x10/0x10
kthread+0x2d2/0x3a0
? _raw_spin_unlock_irq+0x28/0x60
? __pfx_kthread+0x10/0x10
ret_from_fork+0x31/0x70
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
=============================
WARNING: suspicious RCU usage
6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug #1 Tainted:
G W L ------- ---
-----------------------------
net/netfilter/ipset/ip_set_core.c:1211 suspicious
rcu_dereference_protected() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
3 locks held by kworker/u128:1/264:
#0: ffff88810813c958 ((wq_completion)netns){+.+.}-{0:0}, at:
process_one_work+0xeab/0x1460
#1: ffffc90001477da0 (net_cleanup_work){+.+.}-{0:0}, at:
process_one_work+0x82b/0x1460
#2: ffffffff97d9ae98 (pernet_ops_rwsem){++++}-{3:3}, at:
cleanup_net+0xb9/0xa90
stack backtrace:
CPU: 30 PID: 264 Comm: kworker/u128:1 Tainted: G W L
------- --- 6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug
#1
Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING
WIFI, BIOS 2611 04/07/2024
Workqueue: netns cleanup_net
Call Trace:
<TASK>
dump_stack_lvl+0x84/0xd0
lockdep_rcu_suspicious.cold+0xa1/0x134
_destroy_all_sets+0x3a8/0x560 [ip_set]
ip_set_net_exit+0x20/0x50 [ip_set]
ops_exit_list+0x99/0x170
cleanup_net+0x4d9/0xa90
? __pfx_cleanup_net+0x10/0x10
process_one_work+0x8a4/0x1460
? worker_thread+0xe3/0x1010
? __pfx_process_one_work+0x10/0x10
? assign_work+0x16c/0x240
worker_thread+0x5e6/0x1010
? __kthread_parkme+0xb1/0x1d0
? __pfx_worker_thread+0x10/0x10
? __pfx_worker_thread+0x10/0x10
kthread+0x2d2/0x3a0
? _raw_spin_unlock_irq+0x28/0x60
? __pfx_kthread+0x10/0x10
ret_from_fork+0x31/0x70
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
workqueue: gc_worker [nf_conntrack] hogged CPU for >10000us 7 times,
consider switching to WQ_UNBOUND
workqueue: gc_worker [nf_conntrack] hogged CPU for >10000us 11 times,
consider switching to WQ_UNBOUND
Bisect blame this commit:
commit 4e7aaa6b82d63e8ddcbfb56b4fd3d014ca586f10
Author: Jozsef Kadlecsik <kadlec@...filter.org>
Date: Tue Jun 4 15:58:03 2024 +0200
netfilter: ipset: Fix race between namespace cleanup and gc in the
list:set type
Lion Ackermann reported that there is a race condition between
namespace cleanup
in ipset and the garbage collection of the list:set type. The namespace
cleanup can destroy the list:set type of sets while the gc of the
set type is
waiting to run in rcu cleanup. The latter uses data from the
destroyed set which
thus leads use after free. The patch contains the following parts:
- When destroying all sets, first remove the garbage collectors, then wait
if needed and then destroy the sets.
- Fix the badly ordered "wait then remove gc" for the destroy a single set
case.
- Fix the missing rcu locking in the list:set type in the userspace test
case.
- Use proper RCU list handlings in the list:set type.
The patch depends on c1193d9bbbd3 (netfilter: ipset: Add list
flush to cancel_gc).
Fixes: 97f7cf1cd80e (netfilter: ipset: fix performance regression
in swap operation)
Reported-by: Lion Ackermann <nnamrec@...il.com>
Tested-by: Lion Ackermann <nnamrec@...il.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@...filter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@...filter.org>
net/netfilter/ipset/ip_set_core.c | 81
++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------
net/netfilter/ipset/ip_set_list_set.c | 30 ++++++++++++++----------------
2 files changed, 60 insertions(+), 51 deletions(-)
And I can confirm after reverting 4e7aaa6b82d6 the issue is gone.
I also attach the build config and full kernel log.
My hardware specs: https://linux-hardware.org/?probe=80512f0c04
Jozsef can you look into this please?
--
Best Regards,
Mike Gavrilov.
Download attachment "dmesg-6.10.0-0.rc4.20240618git14d7c92f8df9.40.fc41.x86_64+debug.zip" of type "application/zip" (52668 bytes)
Download attachment ".config.zip" of type "application/zip" (66526 bytes)
Powered by blists - more mailing lists