netdev - Re: [PATCH bpf-next 1/2] bpf: add BPF_LWT_ENCAP_IP option to bpf_lwt_push

Message-ID: <50f3135e-5e24-d152-b99c-ca86260fbe12@gmail.com>
Date:   Fri, 30 Nov 2018 13:08:03 -0700
From:   David Ahern <dsahern@...il.com>
To:     Peter Oskolkov <posk@...gle.com>
Cc:     ast@...nel.org, daniel@...earbox.net, netdev@...r.kernel.org,
        posk.devel@...il.com
Subject: Re: [PATCH bpf-next 1/2] bpf: add BPF_LWT_ENCAP_IP option to
 bpf_lwt_push_encap

On 11/28/18 6:34 PM, Peter Oskolkov wrote:
> On Wed, Nov 28, 2018 at 4:47 PM David Ahern <dsahern@...il.com> wrote:
>>
>> On 11/28/18 5:22 PM, Peter Oskolkov wrote:
>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>> index bd0df75dc7b6..17f3c37218e5 100644
>>> --- a/net/core/filter.c
>>> +++ b/net/core/filter.c
>>> @@ -4793,6 +4793,60 @@ static int bpf_push_seg6_encap(struct sk_buff *skb, u32 type, void *hdr, u32 len
>>>  }
>>>  #endif /* CONFIG_IPV6_SEG6_BPF */
>>>
>>> +static int bpf_push_ip_encap(struct sk_buff *skb, void *hdr, u32 len)
>>> +{
>>> +     struct dst_entry *dst;
>>> +     struct rtable *rt;
>>> +     struct iphdr *iph;
>>> +     struct net *net;
>>> +     int err;
>>> +
>>> +     if (skb->protocol != htons(ETH_P_IP))
>>> +             return -EINVAL;  /* ETH_P_IPV6 not yet supported */
>>> +
>>> +     iph = (struct iphdr *)hdr;
>>> +
>>> +     if (unlikely(len < sizeof(struct iphdr) || len > LWTUNNEL_MAX_ENCAP_HSIZE))
>>> +             return -EINVAL;
>>> +     if (unlikely(iph->version != 4 || iph->ihl * 4 > len))
>>> +             return -EINVAL;
>>> +
>>> +     if (skb->sk)
>>> +             net = sock_net(skb->sk);
>>> +     else {
>>> +             net = dev_net(skb_dst(skb)->dev);
>>> +     }
>>> +     rt = ip_route_output(net, iph->daddr, 0, 0, 0);
>>
>> That is a very limited use case. e.g., oif = 0 means you are not
>> considering any kind of policy routing (e.g., VRF).
> 
> Hi David! Could you be a bit more specific re: what you would like to
> see here? Thanks!
> 

Is the encap happening on ingress or egress? The current code does not
seem to assume either direction for lwt (BPF_PROG_TYPE_LWT_IN vs
BPF_PROG_TYPE_LWT_OUT), yet your change does: it is output only.
Basically, you should be filling in a flow struct and doing a proper lookup.

When the potentially custom encap header is pushed on, seems to me skb
marks should still be honored for the route lookup. If not, you should
handle that in the API.

From there skb->dev at a minimum should be used as either iif (ingress)
or oif (egress).

The iph is already set so you have quick access to the tos.
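
The points above (honor the mark, use the device as iif or oif depending on
direction, take the tos from the pushed header) can be illustrated with a
minimal userspace sketch. This is not kernel code and not the patch author's
implementation: struct flow_key, struct pkt_meta, enum lwt_dir and
build_flow_key() are hypothetical names invented for this example, loosely
modeled on the idea of a flow struct. The extra check that the header length
is at least 20 bytes is also an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical direction tag, standing in for LWT_IN vs LWT_OUT programs. */
enum lwt_dir { LWT_DIR_IN, LWT_DIR_OUT };

/* Hypothetical lookup key; NOT the kernel's flow structure. */
struct flow_key {
	int      oif;   /* output interface index, 0 means unset */
	int      iif;   /* input interface index, 0 means unset */
	uint32_t mark;  /* packet mark, so policy routing rules can match */
	uint8_t  tos;   /* TOS byte of the pushed outer header */
	uint8_t  proto; /* protocol field of the pushed outer header */
	uint32_t saddr; /* source address, network byte order */
	uint32_t daddr; /* destination address, network byte order */
};

/* Hypothetical per-packet metadata (models skb->dev->ifindex and skb->mark). */
struct pkt_meta {
	int      ifindex;
	uint32_t mark;
};

/*
 * Fill a lookup key from the raw bytes of the outer IPv4 header.
 * Returns 0 on success, -1 if the header is too short or not IPv4.
 */
static int build_flow_key(struct flow_key *key, const uint8_t *hdr, size_t len,
			  const struct pkt_meta *meta, enum lwt_dir dir)
{
	size_t ihl;

	assert(key && hdr && meta);

	if (len < 20 || (hdr[0] >> 4) != 4)
		return -1;
	ihl = (size_t)(hdr[0] & 0x0f) * 4;
	if (ihl < 20 || ihl > len)
		return -1;

	memset(key, 0, sizeof(*key));
	key->mark  = meta->mark;          /* keep the mark for the lookup */
	key->tos   = hdr[1];              /* tos is already in the header */
	key->proto = hdr[9];
	memcpy(&key->saddr, hdr + 12, 4);
	memcpy(&key->daddr, hdr + 16, 4);

	/* The device is the input interface on ingress, output on egress. */
	if (dir == LWT_DIR_IN)
		key->iif = meta->ifindex;
	else
		key->oif = meta->ifindex;
	return 0;
}
```

A real lookup would hand a key like this to the routing code rather than
passing zeros for everything but the destination address.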

Also, this should implement IPv6 as well before going in.