linux-kernel - Re: [PATCH] bpf: in bpf_skb_adjust_room correct inner protocol for vxlan


Message-Id: <8552C5F8-8410-4E81-8AF4-7018878AFCDC@gmail.com>
Date:   Tue, 9 Feb 2021 18:41:41 +0800
From:   黄学森 <hxseverything@...il.com>
To:     Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc:     David Miller <davem@...emloft.net>, bpf <bpf@...r.kernel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Network Development <netdev@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        chengzhiyong <chengzhiyong@...ishou.com>,
        wangli <wangli09@...ishou.com>
Subject: Re: [PATCH] bpf: in bpf_skb_adjust_room correct inner protocol for
 vxlan

Thanks for your reply, Willem!

The motivation for this commit is that when we use bpf_skb_adjust_room to encapsulate
VXLAN packets, we find that some valuable device features (such as TSO) are disabled.

The root cause is that inner_protocol is set directly to skb->protocol.
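
To make the root cause concrete, here is a minimal userspace sketch of the kind of
check a driver performs through vxlan_features_check. It is a simplified model, not
the kernel code: struct mock_skb, the MOCK_* constants and mock_features_check are
hypothetical names, and only the inner_protocol condition is modelled (the real
helper looks at more fields than this).

```c
#include <assert.h>
#include <stdint.h>

/* EtherType values (host byte order here, for simplicity). */
#define MOCK_ETH_P_IP   0x0800u
#define MOCK_ETH_P_TEB  0x6558u

/* Hypothetical feature bits; NOT the kernel's NETIF_F_* values. */
#define MOCK_F_CSUM  (1u << 0)
#define MOCK_F_GSO   (1u << 1)

/* Hypothetical, heavily reduced stand-in for struct sk_buff. */
struct mock_skb {
	int      encapsulation;    /* packet carries a tunnel header */
	int      outer_l4_is_udp;  /* outer transport protocol is UDP */
	uint16_t inner_protocol;   /* what the encapsulating code recorded */
};

/*
 * Reduced model of a vxlan_features_check-style test: a UDP-encapsulated
 * packet whose inner_protocol is not ETH_P_TEB is not treated as VXLAN,
 * so checksum and segmentation offloads are masked out.
 */
static inline unsigned int mock_features_check(const struct mock_skb *skb,
					       unsigned int features)
{
	if (!skb->encapsulation)
		return features;

	if (skb->outer_l4_is_udp && skb->inner_protocol != MOCK_ETH_P_TEB)
		return features & ~(MOCK_F_CSUM | MOCK_F_GSO);

	return features;
}

/* Usage: compare inner_protocol = skb->protocol with inner_protocol = TEB. */
static inline void mock_features_check_example(void)
{
	const unsigned int all = MOCK_F_CSUM | MOCK_F_GSO;
	struct mock_skb skb = { 1, 1, MOCK_ETH_P_IP };

	/* Current behaviour: inner_protocol copied from skb->protocol. */
	assert(mock_features_check(&skb, all) == 0);

	/* Behaviour with the patch: inner_protocol set to ETH_P_TEB. */
	skb.inner_protocol = MOCK_ETH_P_TEB;
	assert(mock_features_check(&skb, all) == all);
}
```

In this model the offload bits survive only when inner_protocol is ETH_P_TEB, which
matches what we observe with mlx5_core.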

I understand that it is not easy to handle every tunnel protocol in one bpf helper function.
My tentative idea is that when an Ethernet header is pushed, setting inner_protocol to
ETH_P_TEB may be better.

The flag BPF_F_ADJ_ROOM_ENCAP_L4_UDP currently covers many UDP tunnel types (e.g.
UDP+MPLS, Geneve, VXLAN). Adding a separate flag just to represent VXLAN seems somewhat
redundant. What would you suggest?
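
For concreteness, the two options under discussion can be sketched in userspace as
follows. This is only an illustration under stated assumptions: the MOCK_* flag bits
do not match the uapi values, MOCK_F_ENCAP_L2_ETH is a hypothetical flag name that
does not exist in the helper today, and both functions are made up for this mail.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_ETH_HLEN   14u
#define MOCK_ETH_P_IP   0x0800u
#define MOCK_ETH_P_TEB  0x6558u

/* Illustrative flag bits; NOT the values from the uapi header. */
#define MOCK_F_ENCAP_L4_GRE  (1u << 0)
#define MOCK_F_ENCAP_L4_UDP  (1u << 1)
#define MOCK_F_ENCAP_L2_ETH  (1u << 2)  /* hypothetical explicit flag */

/* Option 1 (this patch): infer "Ethernet pushed" from UDP encap + 14 bytes. */
static inline uint16_t mock_inner_proto_heuristic(uint32_t flags,
						  unsigned int inner_mac_len,
						  uint16_t skb_protocol)
{
	if ((flags & MOCK_F_ENCAP_L4_UDP) && inner_mac_len == MOCK_ETH_HLEN)
		return MOCK_ETH_P_TEB;
	return skb_protocol;
}

/* Option 2: the BPF program states explicitly that it pushed an L2 header. */
static inline uint16_t mock_inner_proto_explicit(uint32_t flags,
						 uint16_t skb_protocol)
{
	if (flags & MOCK_F_ENCAP_L2_ETH)
		return MOCK_ETH_P_TEB;
	return skb_protocol;
}

static inline void mock_inner_proto_example(void)
{
	/* VXLAN-like case: both options pick ETH_P_TEB. */
	assert(mock_inner_proto_heuristic(MOCK_F_ENCAP_L4_UDP, MOCK_ETH_HLEN,
					  MOCK_ETH_P_IP) == MOCK_ETH_P_TEB);
	assert(mock_inner_proto_explicit(MOCK_F_ENCAP_L4_UDP | MOCK_F_ENCAP_L2_ETH,
					 MOCK_ETH_P_IP) == MOCK_ETH_P_TEB);

	/* Ethernet over GRE: the heuristic misses it, the explicit flag does not. */
	assert(mock_inner_proto_heuristic(MOCK_F_ENCAP_L4_GRE, MOCK_ETH_HLEN,
					  MOCK_ETH_P_IP) == MOCK_ETH_P_IP);
	assert(mock_inner_proto_explicit(MOCK_F_ENCAP_L4_GRE | MOCK_F_ENCAP_L2_ETH,
					 MOCK_ETH_P_IP) == MOCK_ETH_P_TEB);
}
```

The heuristic also returns ETH_P_TEB for any UDP encapsulation that happens to add
14 bytes between the tunnel header and the inner network header, whether or not
those bytes are an Ethernet header.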

Thanks again for your reply!



> On Feb 8, 2021, at 9:06 PM, Willem de Bruijn <willemdebruijn.kernel@...il.com> wrote:
> 
> On Mon, Feb 8, 2021 at 7:16 AM huangxuesen <hxseverything@...il.com> wrote:
>> 
>> From: huangxuesen <huangxuesen@...ishou.com>
>> 
>> When pushing vxlan tunnel header, set inner protocol as ETH_P_TEB in skb
>> to avoid HW device disabling udp tunnel segmentation offload, just like
>> vxlan_build_skb does.
>> 
>> Drivers for NIC may invoke vxlan_features_check to check the
>> inner_protocol in skb for vxlan packets to decide whether to disable
>> NETIF_F_GSO_MASK. Currently it sets inner_protocol as the original
>> skb->protocol, that will make mlx5_core disable TSO and lead to huge
>> performance degradation.
>> 
>> Signed-off-by: huangxuesen <huangxuesen@...ishou.com>
>> Signed-off-by: chengzhiyong <chengzhiyong@...ishou.com>
>> Signed-off-by: wangli <wangli09@...ishou.com>
>> ---
>> net/core/filter.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 255aeee72402..f8d3ba3fe10f 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -3466,7 +3466,12 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>                skb->inner_mac_header = inner_net - inner_mac_len;
>>                skb->inner_network_header = inner_net;
>>                skb->inner_transport_header = inner_trans;
>> -               skb_set_inner_protocol(skb, skb->protocol);
>> +
>> +               if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP &&
>> +                   inner_mac_len == ETH_HLEN)
>> +                       skb_set_inner_protocol(skb, htons(ETH_P_TEB));
> 
> This may be used by vxlan, but it does not imply it.
> 
> Adding ETH_HLEN bytes likely means pushing an Ethernet header, but same point.
> 
> Conversely, pushing an Ethernet header is not limited to UDP encap.
> 
> This probably needs a new explicit BPF_F_ADJ_ROOM_.. flag, rather than
> trying to infer from imprecise heuristics.