[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20251230120352.GA518022@shredder>
Date: Tue, 30 Dec 2025 14:03:52 +0200
From: Ido Schimmel <idosch@...dia.com>
To: Alexandre Knecht <knecht.alexandre@...il.com>
Cc: Nikolay Aleksandrov <razor@...ckwall.org>, netdev@...r.kernel.org,
roopa@...dia.com
Subject: Re: [PATCH net] bridge: fix C-VLAN preservation in 802.1ad
vlan_tunnel egress
On Tue, Dec 30, 2025 at 12:25:39PM +0100, Alexandre Knecht wrote:
> Hi Ido, Nik, thanks for the review. I'm sorry for the missing
> maintainers, I felt so bad when I saw the bot result after posting,
> will not happen ever again.
>
> > It's not clear from the commit message why 802.1Q bridges are
> > unaffected. I think you mean that in 802.1Q bridges the C-VLAN is never
> > in the payload and always in hwaccel (even when Tx VLAN offload is
> > disabled, thanks to commit 12464bb8de021).
>
> You're right that I should clarify, but I think we may be talking
> about different scenarios.
>
> The 802.1Q bridge I mentioned is specifically the case with
> vlan_tunnel enabled (for VXLAN bridging). In this setup:
>
> 1. The bridge uses per-vlan br_tunnel_info to map C-VLAN (0x8100) to VNI
> 2. On egress via br_handle_egress_vlan_tunnel(), the C-VLAN is cleared
> from hwaccel and used to set the tunnel_id in dst_metadata
> 3. After skb_vlan_pop() clears hwaccel, skb->protocol is the actual
> payload type (e.g., IPv4), not a VLAN ethertype, so the second phase
> of skb_vlan_pop() that pops nested VLANs from payload never triggers
>
> So in this 802.1Q + vlan_tunnel path, there's no C-VLAN in hwaccel at
> the VXLAN driver entry point.
>
> For 802.1ad bridges, here is what I understood of the flow at egress
> in br_handle_egress_vlan_tunnel():
>
> // 1. S-VLAN is in hwaccel at this point (bridge put it there)
> tunnel_id = READ_ONCE(vlan->tinfo.tunnel_id);
> if (!tunnel_id || unlikely(!skb_vlan_tag_present(skb))) // ← checks hwaccel!
> return 0;
>
> // 2. Set tunnel metadata (VNI) : this goes in skb_dst, NOT hwaccel
> skb_dst_drop(skb);
> // ... creates metadata_dst with tunnel_id ...
> skb_dst_set(skb, &tunnel_dst->dst);
>
> // 3. Now remove the S-VLAN from hwaccel (it's been "consumed" for VNI lookup)
> err = skb_vlan_pop(skb); // ← BUG: also pops C-VLAN from payload
>
> So at the moment skb_vlan_pop() is called:
> - S-VLAN: in hwaccel (skb->vlan_tci) : needs to be cleared
> - C-VLAN: in packet payload : should NOT be touched
> - VNI: already set in skb_dst() metadata : separate from VLAN
>
> The S-VLAN was used to lookup the VNI, but it's still sitting in
> hwaccel and needs to be removed before the packet goes into the VXLAN
> tunnel. The problem is skb_vlan_pop() does more than just clear
> hwaccel, it also "normalizes" any nested VLAN it finds in the payload.
> The fix uses __vlan_hwaccel_clear_tag() instead, which only clears
> hwaccel without touching the payload.
I agree with the analysis and the fix looks correct to me. Also tested
with test_vxlan_vnifiltering.sh which tests the 802.1Q scenario.
Powered by blists - more mailing lists