Message-ID: <20cd265b-d52d-fd1f-c47e-bfa7ea15518f@gmail.com>
Date: Sat, 5 Jun 2021 19:16:14 +0200
From: Oliver Herms <oliver.peter.herms@...il.com>
To: Network Development <netdev@...r.kernel.org>
Cc: David Miller <davem@...emloft.net>, David Ahern <dsahern@...il.com>
Subject: VRF/IPv4/ARP: unregister_netdevice waiting for dev to become free ->
Who's responsible for releasing dst_entry created by ip_route_input_noref?
Hi everyone,
I'm observing a device unregistration issue when I try to delete a VRF interface after using the VRF.
The issue is reproducible on 5.12.9, 5.10.24, 5.11.0-18 (debian).
Here are the steps to reproduce the issue:
ip addr add 10.0.0.1/32 dev lo
ip netns add test-ns
ip link add veth-outside type veth peer name veth-inside
ip link add vrf-100 type vrf table 1100
ip link set veth-outside master vrf-100
ip link set veth-inside netns test-ns
ip link set veth-outside up
ip link set vrf-100 up
ip route add 10.1.1.1/32 dev veth-outside table 1100
ip netns exec test-ns ip link set veth-inside up
ip netns exec test-ns ip addr add 10.1.1.1/32 dev veth-inside
ip netns exec test-ns ip route add 10.0.0.1/32 dev veth-inside
ip netns exec test-ns ip route add default via 10.0.0.1
ip netns exec test-ns ping 10.0.0.1 -c 1 -i 1
sleep 10
ip link set veth-outside nomaster
ip link set vrf-100 down
ip link delete vrf-100 <= Never returns
The issue does not occur if I skip the ping.
I've tracked down all calls to dev_hold and dev_put.
When the ping command is run, there is the following call to dev_hold for which the corresponding dev_put seems to be missing (it does not happen even when the VRF is set down or deleted):
[ 284.528775] CPU: 2 PID: 1205 Comm: ping Not tainted 5.12.9 #1
[ 284.528790] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 284.528796] Call Trace:
[ 284.528802] <IRQ>
[ 284.528832] dump_stack+0x7d/0x9c
[ 284.528854] dst_alloc.cold+0x11/0x2a
[ 284.528866] rt_dst_alloc+0x48/0xd0
[ 284.528881] ip_route_input_slow+0x507/0xc80
[ 284.528900] ip_route_input_rcu+0x258/0x270
[ 284.528913] ip_route_input_noref+0x2a/0x50
[ 284.528923] arp_process+0x4da/0x8a0
[ 284.528938] arp_rcv+0x1a9/0x1d0
[ 284.528948] ? trigger_load_balance+0x205/0x240
[ 284.528961] __netif_receive_skb_one_core+0x8d/0xa0
[ 284.528974] __netif_receive_skb+0x18/0x60
[ 284.528984] process_backlog+0xa2/0x170
[ 284.528993] __napi_poll+0x31/0x170
[ 284.529002] net_rx_action+0x22f/0x280
[ 284.529012] __do_softirq+0xce/0x281
[ 284.529024] do_softirq+0x77/0xa0
[ 284.529049] </IRQ>
[ 284.529054] __local_bh_enable_ip+0x50/0x60
[ 284.529064] ip_finish_output2+0x1ab/0x590
[ 284.529073] ? __cgroup_bpf_run_filter_skb+0x3ce/0x3e0
[ 284.529086] __ip_finish_output+0x110/0x270
[ 284.529096] ip_finish_output+0x2d/0xb0
[ 284.529105] ip_output+0x78/0x100
[ 284.529114] ? __ip_finish_output+0x270/0x270
[ 284.529122] ip_push_pending_frames+0xa3/0xb0
[ 284.529131] raw_sendmsg+0x5f0/0xdb0
[ 284.529144] ? setup_min_slab_ratio+0x68/0x90
[ 284.529182] ? __cond_resched+0x1a/0x50
[ 284.529195] ? aa_sk_perm+0x43/0x1b0
[ 284.529211] inet_sendmsg+0x6c/0x70
[ 284.529221] sock_sendmsg+0x5e/0x70
[ 284.529234] __sys_sendto+0x113/0x190
[ 284.529249] ? handle_mm_fault+0xda/0x2c0
[ 284.529258] ? do_user_addr_fault+0x1f5/0x670
[ 284.529266] ? exit_to_user_mode_prepare+0x37/0x190
[ 284.529277] __x64_sys_sendto+0x29/0x30
[ 284.529287] do_syscall_64+0x38/0x90
[ 284.529298] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 284.529306] RIP: 0033:0x7f89f02db53a
[ 284.529317] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c3 0f 1f 44 00 00 55 48 83 ec 30 44 89 4c
[ 284.529325] RSP: 002b:00007ffd7c1b0478 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 284.529335] RAX: ffffffffffffffda RBX: 00007ffd7c1b1c00 RCX: 00007f89f02db53a
[ 284.529340] RDX: 0000000000000040 RSI: 00005592d86be100 RDI: 0000000000000003
[ 284.529345] RBP: 00005592d86be100 R08: 00007ffd7c1b3e7c R09: 0000000000000010
[ 284.529349] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
[ 284.529354] R13: 00007ffd7c1b1bc0 R14: 00007ffd7c1b0480 R15: 0000001d00000001
Processing the incoming ARP request causes a call to ip_route_input_noref => ip_route_input_rcu => ip_route_input_slow => rt_dst_alloc => dst_alloc => dev_hold.
In a non-VRF use case, dst->dev would be the loopback interface, which is never deleted. In the VRF use case, dst->dev is the VRF interface, and that is the one I would like to delete.
I've tracked down that dst_release() would call dev_put(), but it seems dst_release() is not called here (though I guess it should be). This leaks a dst_entry, which keeps a reference on the VRF device and makes it unremovable.
At least that's what it looks like to me.
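To make the suspected imbalance concrete, here is a minimal userspace sketch in C. It is not kernel code: the toy_* types and functions are hypothetical stand-ins for net_device, dst_entry, dev_hold, dev_put, dst_alloc and dst_release, and the sketch assumes (as a simplification) that a device can be unregistered only once its reference count is back to zero and that the final dst release drops the device reference immediately.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy stand-ins for struct net_device and struct dst_entry. */
struct toy_dev { int refcnt; };
struct toy_dst { struct toy_dev *dev; int refcnt; };

/* Take a reference on the device. */
void toy_dev_hold(struct toy_dev *dev)
{
    dev->refcnt++;
}

/* Drop a reference on the device; dropping more than was taken is a bug. */
void toy_dev_put(struct toy_dev *dev)
{
    assert(dev->refcnt > 0);
    dev->refcnt--;
}

/* Allocating a dst pins the device it points at (dst_alloc => dev_hold). */
struct toy_dst *toy_dst_alloc(struct toy_dev *dev)
{
    struct toy_dst *dst = malloc(sizeof(*dst));

    if (!dst)
        return NULL;
    dst->dev = dev;
    dst->refcnt = 1;
    toy_dev_hold(dev);
    return dst;
}

/* Dropping the last dst reference hands the device reference back. */
void toy_dst_release(struct toy_dst *dst)
{
    assert(dst->refcnt > 0);
    if (--dst->refcnt == 0) {
        toy_dev_put(dst->dev);
        free(dst);
    }
}

/* Assumption of this sketch: unregistration completes only at refcnt 0. */
int toy_can_unregister(const struct toy_dev *dev)
{
    return dev->refcnt == 0;
}
```

In this model, calling toy_dst_alloc() on a device without ever calling toy_dst_release() leaves toy_can_unregister() false forever, which would match the "waiting for dev to become free" behaviour if the dst created during ARP processing is never released.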
So: Who's responsible for releasing dst_entry created by ip_route_input_noref in arp_process?
Kind Regards
Oliver