Message-ID: <YjRZY6CXOJPCAoNK@linutronix.de>
Date: Fri, 18 Mar 2022 11:05:23 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Saeed Mahameed <saeed@...nel.org>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Eric Dumazet <edumazet@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH net-next] net: Add lockdep asserts to ____napi_schedule().
On 2022-03-17 12:21:45 [-0700], Saeed Mahameed wrote:
> On 11 Mar 16:03, Sebastian Andrzej Siewior wrote:
> > ____napi_schedule() needs to be invoked with disabled interrupts due to
> > __raise_softirq_irqoff (in order not to corrupt the per-CPU list).
>
> This patch is likely causing the following call trace when RPS is enabled:
>
> [ 690.429122] WARNING: CPU: 0 PID: 0 at net/core/dev.c:4268 rps_trigger_softirq+0x21/0xb0
> [ 690.431236] Modules linked in: bonding ib_ipoib ipip tunnel4 geneve ib_umad ip6_gre ip6_tunnel tunnel6 rdma_ucm nf_tables ip_gre gre mlx5_ib ib_uverbs mlx5_core iptable_raw openvswitch nsh xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink rpcrdma ib_iser xt_addrtype libiscsi scsi_transport_iscsi rdma_cm iw_cm iptable_nat nf_nat ib_cm br_netfilter ib_core overlay fuse [last unloaded: ib_uverbs]
> [ 690.439693] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc7_net_next_4303f9c #1
> [ 690.441709] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> [ 690.444587] RIP: 0010:rps_trigger_softirq+0x21/0xb0
> [ 690.445971] Code: ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 05 b1 ea 21 01 53 48 89 fb 85 c0 74 1b 65 8b 05 16 56 71 7e a9 00 ff 0f 00 75 02 <0f> 0b 65 8b 05 4e 5f 72 7e 85 c0 74 5b 48 8b 83 e0 01 00 00 f6 c4
> [ 690.450682] RSP: 0018:ffffffff82803e70 EFLAGS: 00010046
> [ 690.452106] RAX: 0000000000000001 RBX: ffff88852ca3d400 RCX: ffff88852ca3d540
> [ 690.453972] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88852ca3d400
> [ 690.455860] RBP: 0000000000000000 R08: ffff88852ca3d400 R09: 0000000000000001
> [ 690.457684] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
> [ 690.459548] R13: ffff88852ca3d540 R14: ffffffff82829628 R15: 0000000000000000
> [ 690.461429] FS: 0000000000000000(0000) GS:ffff88852ca00000(0000) knlGS:0000000000000000
> [ 690.463653] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 690.465180] CR2: 00007ff718200b98 CR3: 000000013b4de003 CR4: 0000000000370eb0
> [ 690.467022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 690.468915] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 690.470742] Call Trace:
> [ 690.471544] <TASK>
> [ 690.472242] flush_smp_call_function_queue+0xe5/0x1e0
> [ 690.473639] flush_smp_call_function_from_idle+0x5f/0xa0
>
>
> For some reason - that i haven't looked into yet -
> net_rps_send_ipi() will eventually ____napi_schedule()
> only after enabling IRQ. see net_rps_action_and_irq_enable()
Perfect. There is a do_softirq() in flush_smp_call_function_from_idle(),
so it is fine.
PeterZ, any idea how to shut lockdep up here? Playing with the preemption
counter will do the trick… I'm worried that by disabling BH
unconditionally here it will need to be done by the upper caller and
this in turn will force a BH-flush on PREEMPT_RT. While it looks
harmless in the idle case, it looks bad for migration_cpu_stop().
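
To make the failing condition concrete, here is a minimal userspace sketch
of the kind of check that fires in the trace above. It is a model only, not
the kernel code: struct cpu_ctx, softirq_will_run() and
model_napi_schedule_warnings() are hypothetical names, and the mask values
are assumed to match the usual preempt_count layout (softirq bits 8-15,
hardirq bits 16-19).

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed preempt_count layout; treat these values as illustrative. */
#define SOFTIRQ_MASK   0x0000ff00u
#define HARDIRQ_MASK   0x000f0000u
#define SOFTIRQ_OFFSET 0x00000100u
#define HARDIRQ_OFFSET 0x00010000u

/* Hypothetical stand-in for the per-CPU state the asserts look at. */
struct cpu_ctx {
	unsigned int preempt_count;
	bool irqs_disabled;
};

/*
 * A raised softirq is only guaranteed to be handled on context exit if we
 * are inside a BH-disabled/softirq section or inside a hardirq.
 */
static bool softirq_will_run(const struct cpu_ctx *c)
{
	return (c->preempt_count & (SOFTIRQ_MASK | HARDIRQ_MASK)) != 0;
}

/* Returns how many of the two modeled asserts would warn. */
static int model_napi_schedule_warnings(const struct cpu_ctx *c)
{
	int warnings = 0;

	if (!softirq_will_run(c)) {
		fprintf(stderr, "WARN: neither in softirq nor in hardirq\n");
		warnings++;
	}
	if (!c->irqs_disabled) {
		fprintf(stderr, "WARN: interrupts are enabled\n");
		warnings++;
	}
	return warnings;
}
```

In this model the idle path from the trace corresponds to interrupts
disabled with both the softirq and hardirq bits clear, so only the first
check warns, even though the pending softirq is handled afterwards.
Setting SOFTIRQ_OFFSET in preempt_count (the "playing with the preemption
counter" option) silences it.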
Sebastian