[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <be4297e7-e280-4dab-983e-1253e09524c5@blackwall.org>
Date: Tue, 2 Sep 2025 15:16:09 +0300
From: Nikolay Aleksandrov <razor@...ckwall.org>
To: Ido Schimmel <idosch@...dia.com>, netdev@...r.kernel.org
Cc: davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
edumazet@...gle.com, andrew+netdev@...n.ch, horms@...nel.org,
petrm@...dia.com, mcremers@...udbear.nl
Subject: Re: [PATCH net 1/3] vxlan: Fix NPD when refreshing an FDB entry with
a nexthop object
On 9/1/25 09:50, Ido Schimmel wrote:
> VXLAN FDB entries can point to either a remote destination or an FDB
> nexthop group. The latter is usually used in EVPN deployments where
> learning is disabled.
>
> However, when learning is enabled, an incoming packet might try to
> refresh an FDB entry that points to an FDB nexthop group and therefore
> does not have a remote. Such packets should be dropped, but they are
> only dropped after dereferencing the non-existent remote, resulting in a
> NPD [1] which can be reproduced using [2].
>
> Fix by dropping such packets earlier. Remove the misleading comment from
> first_remote_rcu().
>
> [1]
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> [...]
> CPU: 13 UID: 0 PID: 361 Comm: mausezahn Not tainted 6.17.0-rc1-virtme-g9f6b606b6b37 #1 PREEMPT(voluntary)
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-4.fc41 04/01/2014
> RIP: 0010:vxlan_snoop+0x98/0x1e0
> [...]
> Call Trace:
> <TASK>
> vxlan_encap_bypass+0x209/0x240
> encap_bypass_if_local+0xb1/0x100
> vxlan_xmit_one+0x1375/0x17e0
> vxlan_xmit+0x6b4/0x15f0
> dev_hard_start_xmit+0x5d/0x1c0
> __dev_queue_xmit+0x246/0xfd0
> packet_sendmsg+0x113a/0x1850
> __sock_sendmsg+0x38/0x70
> __sys_sendto+0x126/0x180
> __x64_sys_sendto+0x24/0x30
> do_syscall_64+0xa4/0x260
> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>
> [2]
> #!/bin/bash
>
> ip address add 192.0.2.1/32 dev lo
> ip address add 192.0.2.2/32 dev lo
>
> ip nexthop add id 1 via 192.0.2.3 fdb
> ip nexthop add id 10 group 1 fdb
>
> ip link add name vx0 up type vxlan id 10010 local 192.0.2.1 dstport 12345 localbypass
> ip link add name vx1 up type vxlan id 10020 local 192.0.2.2 dstport 54321 learning
>
> bridge fdb add 00:11:22:33:44:55 dev vx0 self static dst 192.0.2.2 port 54321 vni 10020
> bridge fdb add 00:aa:bb:cc:dd:ee dev vx1 self static nhid 10
>
> mausezahn vx0 -a 00:aa:bb:cc:dd:ee -b 00:11:22:33:44:55 -c 1 -q
>
> Fixes: 1274e1cc4226 ("vxlan: ecmp support for mac fdb entries")
> Reported-by: Marlin Cremers <mcremers@...udbear.nl>
> Reviewed-by: Petr Machata <petrm@...dia.com>
> Signed-off-by: Ido Schimmel <idosch@...dia.com>
> ---
> drivers/net/vxlan/vxlan_core.c | 8 ++++----
> drivers/net/vxlan/vxlan_private.h | 4 +---
> 2 files changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
> index f32be2e301f2..0f6a7c89a669 100644
> --- a/drivers/net/vxlan/vxlan_core.c
> +++ b/drivers/net/vxlan/vxlan_core.c
> @@ -1445,6 +1445,10 @@ static enum skb_drop_reason vxlan_snoop(struct net_device *dev,
> if (READ_ONCE(f->updated) != now)
> WRITE_ONCE(f->updated, now);
>
> + /* Don't override an fdb with nexthop with a learnt entry */
> + if (rcu_access_pointer(f->nh))
> + return SKB_DROP_REASON_VXLAN_ENTRY_EXISTS;
> +
> if (likely(vxlan_addr_equal(&rdst->remote_ip, src_ip) &&
> rdst->remote_ifindex == ifindex))
> return SKB_NOT_DROPPED_YET;
> @@ -1453,10 +1457,6 @@ static enum skb_drop_reason vxlan_snoop(struct net_device *dev,
> if (f->state & (NUD_PERMANENT | NUD_NOARP))
> return SKB_DROP_REASON_VXLAN_ENTRY_EXISTS;
>
> - /* Don't override an fdb with nexthop with a learnt entry */
> - if (rcu_access_pointer(f->nh))
> - return SKB_DROP_REASON_VXLAN_ENTRY_EXISTS;
> -
> if (net_ratelimit())
> netdev_info(dev,
> "%pM migrated from %pIS to %pIS\n",
> diff --git a/drivers/net/vxlan/vxlan_private.h b/drivers/net/vxlan/vxlan_private.h
> index 6c625fb29c6c..99fe772ad679 100644
> --- a/drivers/net/vxlan/vxlan_private.h
> +++ b/drivers/net/vxlan/vxlan_private.h
> @@ -61,9 +61,7 @@ static inline struct hlist_head *vs_head(struct net *net, __be16 port)
> return &vn->sock_list[hash_32(ntohs(port), PORT_HASH_BITS)];
> }
>
> -/* First remote destination for a forwarding entry.
> - * Guaranteed to be non-NULL because remotes are never deleted.
> - */
> +/* First remote destination for a forwarding entry. */
> static inline struct vxlan_rdst *first_remote_rcu(struct vxlan_fdb *fdb)
> {
> if (rcu_access_pointer(fdb->nh))
Nice catch,
Reviewed-by: Nikolay Aleksandrov <razor@...ckwall.org>
Powered by blists - more mailing lists