Message-ID: <20180814121734.105769fa@redhat.com>
Date: Tue, 14 Aug 2018 12:17:34 +0200
From: Jesper Dangaard Brouer <jbrouer@...hat.com>
To: Jason Wang <jasowang@...hat.com>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
ast@...nel.org, daniel@...earbox.net, mst@...hat.com
Subject: Re: [RFC PATCH net-next V2 0/6] XDP rx handler
On Tue, 14 Aug 2018 15:59:01 +0800
Jason Wang <jasowang@...hat.com> wrote:
> On 2018年08月14日 08:32, Alexei Starovoitov wrote:
> > On Mon, Aug 13, 2018 at 11:17:24AM +0800, Jason Wang wrote:
> >> Hi:
> >>
> >> This series tries to implement XDP support for the rx handler. This
> >> would be useful for doing native XDP on stacked devices like macvlan,
> >> bridge or even bond.
> >>
> >> The idea is simple: let the stacked device register an XDP rx
> >> handler. When the driver returns XDP_PASS, it will call a new helper
> >> xdp_do_pass() which tries to pass the XDP buff to the XDP rx handler
> >> directly. The XDP rx handler may then decide how to proceed: it can
> >> consume the buff, ask the driver to drop the packet, or ask the
> >> driver to fall back to the normal skb path.
> >>
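To make the proposed flow concrete, here is a rough sketch of how a
driver rx path might use this helper. Only xdp_do_pass() comes from the
cover letter; the XDP_RX_HANDLER_* result names and the driver-local
functions are assumptions for illustration, not taken from the patches.

/* Sketch only: xdp_do_pass() is the proposed helper; the
 * XDP_RX_HANDLER_* names and drv_*() functions are made up.
 */
static void drv_handle_xdp_pass(struct net_device *dev, struct xdp_buff *xdp)
{
	/* The driver's own XDP program already returned XDP_PASS; now give
	 * the stacked device (e.g. macvlan) a chance to run its XDP program
	 * on the raw xdp_buff before any skb is built.
	 */
	switch (xdp_do_pass(xdp)) {
	case XDP_RX_HANDLER_CONSUMED:	/* assumed name: handler took the buffer */
		return;
	case XDP_RX_HANDLER_DROP:	/* assumed name: handler wants it dropped */
		drv_recycle_rx_buffer(xdp->data);
		return;
	case XDP_RX_HANDLER_FALLBACK:	/* assumed name: continue as before */
	default:
		drv_build_skb_and_receive(dev, xdp);
		return;
	}
}
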
> >> A sample XDP rx handler was implemented for macvlan, and virtio-net
> >> (mergeable buffer case) was converted to call xdp_do_pass() as an
> >> example. For ease of comparison, generic XDP support for the rx
> >> handler was also implemented.
> >>
> >> Compared to skb mode XDP on macvlan, native XDP on macvlan (XDP_DROP)
> >> shows about 83% improvement.
> > I'm missing the motivation for this.
> > It seems the performance of such a solution is ~1M packets per second.
>
> Notice it was measured with virtio-net, which is kind of slow.
>
> > What would be a real life use case for such feature ?
>
> I had another run on top of 10G mlx4 and macvlan:
>
> XDP_DROP on mlx4: 14.0Mpps
> XDP_DROP on macvlan: 10.05Mpps
>
> Perf shows that macvlan_hash_lookup() and the indirect call to
> macvlan_handle_xdp() are the reasons for the drop. I think the
> numbers are acceptable, and we could try more optimizations on top.
>
> So the real life use case here is to have a fast XDP path for rx
> handler based devices:
>
> - For containers, we can run XDP for macvlan (~70% of wire speed).
> This allows a container-specific policy.
> - For VMs, we can implement a macvtap XDP rx handler on top. This
> allows us to forward packets to the VM without building an skb in the
> macvtap setup.
> - The idea could be used by other rx handler based devices like
> bridge; we may get an XDP fast forwarding path for bridge.
>
> >
> > Another concern is that XDP users expect to get line rate performance
> > and native XDP delivers it. 'generic XDP' is a fallback-only
> > mechanism to operate on NICs that don't have native XDP yet.
>
> So I can replace the generic XDP TX routine with a native one for macvlan.
If you simply implement ndo_xdp_xmit() for macvlan, and instead use
XDP_REDIRECT, then we are basically done.
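For reference, a minimal sketch of that suggestion, assuming the
current net-next ndo_xdp_xmit() signature (dev, n frames, flags); MAC
handling, stats and error accounting are left out:

/* Sketch: hand the XDP frames straight to the lower device's
 * ndo_xdp_xmit(), which owns the real HW TX queues.
 */
static int macvlan_xdp_xmit(struct net_device *dev, int n,
			    struct xdp_frame **frames, u32 flags)
{
	struct macvlan_dev *vlan = netdev_priv(dev);
	struct net_device *lowerdev = vlan->lowerdev;

	if (!lowerdev->netdev_ops->ndo_xdp_xmit)
		return -EOPNOTSUPP;

	return lowerdev->netdev_ops->ndo_xdp_xmit(lowerdev, n, frames, flags);
}
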
> > Toshiaki's veth XDP work fits XDP philosophy and allows
> > high speed networking to be done inside containers after veth.
> > It's trying to get to line rate inside container.
>
> This is one of the goals of this series as well. I agree the veth XDP
> work looks pretty fine, but I believe it only works for a specific
> setup, since it depends on XDP_REDIRECT, which is supported by only a
> few drivers (and there is no VF driver support).
The XDP_REDIRECT (RX-side) is trivial to add to drivers. It is a bad
argument that only a few drivers implement this, especially since all
drivers also need to be extended with your proposed xdp_do_pass() call.

(rant) The thing that is delaying XDP_REDIRECT adoption in drivers is
that the TX-side is harder to implement, as the ndo_xdp_xmit() call has
to allocate HW TX-queue resources. If we disconnect the RX and TX sides
of redirect, then we can implement the RX-side in an afternoon.
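
To illustrate how small the RX-side really is, a sketch of the usual
pattern in a driver's poll loop (struct my_rx_ring is a stand-in for
the driver's own ring structure; xdp_do_redirect() and
xdp_do_flush_map() are the existing core helpers):

#include <linux/filter.h>
#include <linux/netdevice.h>

struct my_rx_ring {			/* stand-in for the driver's ring struct */
	struct net_device *netdev;
};

/* Returns true if the buffer was consumed by XDP */
static bool my_run_xdp(struct my_rx_ring *ring, struct xdp_buff *xdp,
		       struct bpf_prog *xdp_prog)
{
	switch (bpf_prog_run_xdp(xdp_prog, xdp)) {
	case XDP_PASS:
		return false;			/* continue to the skb path */
	case XDP_REDIRECT:
		if (xdp_do_redirect(ring->netdev, xdp, xdp_prog) == 0)
			return true;		/* buffer handed off */
		/* fall through: treat redirect failure as a drop */
	default:				/* XDP_TX/ABORTED omitted in this sketch */
	case XDP_DROP:
		page_frag_free(xdp->data);
		return true;
	}
}

/* ...and once at the end of the NAPI poll routine: xdp_do_flush_map(); */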
> And in order to make it work for an end user, the XDP program still
> needs logic like a hash(map) lookup to determine the destination veth.
That _is_ the general idea behind XDP and eBPF: that we need to add
logic that determines the destination. The kernel provides the basic
mechanisms for moving/redirecting packets fast, and someone else builds
an orchestration tool like Cilium that adds the needed logic.
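
A tiny example of what that "logic" looks like on the BPF side: an XDP
program that picks an egress device from a devmap and redirects to it.
(The map is keyed by RX queue index purely to keep the sketch short; a
real orchestrator would key on packet headers. bpf_helpers.h is the
helper header shipped with the kernel samples/selftests.)

#include <linux/bpf.h>
#include "bpf_helpers.h"

struct bpf_map_def SEC("maps") tx_port = {
	.type        = BPF_MAP_TYPE_DEVMAP,
	.key_size    = sizeof(int),
	.value_size  = sizeof(int),
	.max_entries = 64,
};

SEC("xdp")
int xdp_redirect_example(struct xdp_md *ctx)
{
	int key = ctx->rx_queue_index;

	/* Stashes the map/key and returns XDP_REDIRECT; the actual forward
	 * happens in the driver's xdp_do_redirect() call. */
	return bpf_redirect_map(&tx_port, key, 0);
}

char _license[] SEC("license") = "GPL";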
Did you notice that we (Ahern) added bpf_fib_lookup, a FIB route lookup
accessible from XDP?
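
For completeness, the typical usage pattern of that helper from an XDP
program, roughly as in the kernel's xdp_fwd sample (header names and
the exact return-code constants are whatever your tree/toolchain
provides, so treat this as a sketch):

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include "bpf_helpers.h"
#include "bpf_endian.h"

#ifndef AF_INET
#define AF_INET 2	/* same value as in linux/socket.h */
#endif

SEC("xdp")
int xdp_fib_fwd(struct xdp_md *ctx)
{
	void *data_end = (void *)(long)ctx->data_end;
	void *data     = (void *)(long)ctx->data;
	struct ethhdr *eth = data;
	struct iphdr  *iph = data + sizeof(*eth);
	struct bpf_fib_lookup fib = {};

	if ((void *)(iph + 1) > data_end)
		return XDP_PASS;
	if (eth->h_proto != bpf_htons(ETH_P_IP))
		return XDP_PASS;

	fib.family      = AF_INET;
	fib.tos         = iph->tos;
	fib.l4_protocol = iph->protocol;
	fib.tot_len     = bpf_ntohs(iph->tot_len);
	fib.ipv4_src    = iph->saddr;
	fib.ipv4_dst    = iph->daddr;
	fib.ifindex     = ctx->ingress_ifindex;

	if (bpf_fib_lookup(ctx, &fib, sizeof(fib), 0) != BPF_FIB_LKUP_RET_SUCCESS)
		return XDP_PASS;

	/* On success the kernel filled in next-hop MACs and egress ifindex */
	__builtin_memcpy(eth->h_dest,   fib.dmac, ETH_ALEN);
	__builtin_memcpy(eth->h_source, fib.smac, ETH_ALEN);
	return bpf_redirect(fib.ifindex, 0);
}

char _license[] SEC("license") = "GPL";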
For macvlan, I imagine that we could add a BPF helper that allows you
to lookup/call macvlan_hash_lookup().
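
Purely hypothetical, to show the shape of that idea (no such helper
exists today, and the name is made up): an XDP program on the lower
device could ask which macvlan owns the destination MAC and redirect to
it, along these lines:

	/* imagined helper wrapping macvlan_hash_lookup(); returns the
	 * ifindex of the macvlan that owns this destination MAC, or <= 0 */
	int ifindex = bpf_macvlan_lookup(ctx, eth->h_dest);

	if (ifindex > 0)
		return bpf_redirect(ifindex, 0);
	return XDP_PASS;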
> > This XDP rx handler stuff is destined to stay at 1Mpps speeds forever,
> > and users will get confused by forever-slow modes of XDP.
> >
> > Please explain the problem you're trying to solve.
> > "look, here I can to XDP on top of macvlan" is not an explanation of the problem.
> >
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer