netdev - Re: [PATCH v15 ] net/veth/XDP: Line-rate packet forwarding in kernel

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Tue, 3 Apr 2018 11:00:20 -0600
From:   David Ahern <dsahern@...il.com>
To:     John Fastabend <john.fastabend@...il.com>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     "Md. Islam" <mislam4@...t.edu>, netdev@...r.kernel.org,
        David Miller <davem@...emloft.net>, stephen@...workplumber.org,
        agaceph@...il.com, Pavel Emelyanov <xemul@...nvz.org>,
        Eric Dumazet <edumazet@...gle.com>, brouer@...hat.com
Subject: Re: [PATCH v15 ] net/veth/XDP: Line-rate packet forwarding in kernel

On 4/3/18 10:41 AM, John Fastabend wrote:
> On 04/03/2018 08:07 AM, David Ahern wrote:
>> On 4/2/18 12:16 PM, Alexei Starovoitov wrote:
>>> On Mon, Apr 02, 2018 at 12:09:44PM -0600, David Ahern wrote:
>>>> On 4/2/18 12:03 PM, John Fastabend wrote:
>>>>>
>>>>> Can the above be a normal BPF helper that returns an
>>>>> ifindex? Then something roughly like this patter would
>>>>> work for all drivers with redirect support,
>>>>>
>>>>>
>>>>>      route_ifindex = ip_route_lookup(__daddr, ....)
>>>>>      if (!route_ifindex)
>>>>>            return do_foo()
>>>>>      return xdp_redirect(route_ifindex);
>>>>>      
>>>>> So my suggestion is,
>>>>>
>>>>>   1. enable veth xdp (including redirect support)
>>>>>   2. add a helper to lookup route from routing table
>>>>>
>>>>> Alternatively you can skip step (2) and encode the routing
>>>>> table in BPF directly. Maybe we need a more efficient data
>>>>> structure but that should also work.
>>>>>
>>>>
>>>> That's what I have here:
>>>>
>>>> https://github.com/dsahern/linux/commit/bab42f158c0925339f7519df7fb2cde8eac33aa8
>>>
>>> was wondering what's up with the delay and when are you going to
>>> submit them officially...
>>> The use case came up several times.
>>>
>>
>> I need to find time to come back to that set. As I recall there a number
>> of outstanding issues:
>>
>> 1. you and Daniel had comments about the bpf_func_proto declarations
>>
>> 2. Jesper had concerns about xdp redirect to any netdev. e.g., How does
>> the lookup know the egress netdev supports xdp? Right now you can try
>> and the packet is dropped if it is not supported.
>>
> 
> There should probably be a tracepoint there if not already. Otherwise
> I think the orchestration/loader layer should be ensuring that xdp
> support is sufficient. I don't think we need anything specific in the
> XDP/BPF code to handle unsupported devices.

ok. I am fine with starting with that.

> 
>> 3. VLAN devices. I suspect these will affect the final bpf function
>> prototype. It would awkward to have 1 forwarding API for non-vlan
>> devices and a second for vlan devices, hence the need to resolve this
>> before it goes in.
>>
> 
> Interesting. Do we need stacked XDP, I could imagine having 802.1Q
> simply call the lower dev XDP xmit routine. Possibly adding the 8021q
> header first.
> 
> Or alternatively a new dev type could let users query things like
> vlan-id from the dev rather than automatically doing the tagging. I
> suspect though if you forward to a vlan device automatically adding
> the tag is the correct behavior.
> 
> 
>> 4. What about other stacked devices - bonds and bridges - will those
>> just work with the bpf helper? VRF is already handled of course. ;-)
>>
> 
> So if we simply handle this like other stacked devices and call the
> lower devs xdp_xmit routine we should get reasonable behavior. For
> bonds and bridges I guess some generalization is needed though because
> everything at the moment is skb centric. I don't think its necessary
> in the first series though. It can be added later.
> 

For 3 and 4 above I was referring to the route lookup part of it; sorry
for not being clear.

For example, eth1 is enslaved to bond1 which is in VRF red. The lookup
needs to go to the table associated with the VRF. That is not known by
just looking at eth1. The code exists to walk the upper layers and do
the effective translations, just need to cover those cases.

The VLAN part of it is a bit more difficult - ingress device for the
lookup should be eth1.100 for example not eth1, and then if eth1.100 is
enslaved to a VRF, ...

None of it is that complex, just need to walk through the various use
cases and make sure bpf_ipv4_fwd_lookup and bpf_ipv6_fwd_lookup can do
the right thing for these common use cases.

Handling lwt and mpls for example, can certainly be a follow on.