Message-ID: <16b923bc-5e2f-624d-2e14-ddf6836d16f6@gmail.com>
Date: Mon, 27 Apr 2020 12:58:22 -0600
From: David Ahern <dsahern@...il.com>
To: John Fastabend <john.fastabend@...il.com>,
David Ahern <dsahern@...nel.org>, netdev@...r.kernel.org
Cc: davem@...emloft.net, kuba@...nel.org,
prashantbhole.linux@...il.com, jasowang@...hat.com,
brouer@...hat.com, toke@...hat.com, toshiaki.makita1@...il.com,
daniel@...earbox.net, ast@...nel.org, kafai@...com,
songliubraving@...com, yhs@...com, andriin@...com
Subject: Re: [PATCH v3 bpf-next 00/15] net: Add support for XDP in egress path
On 4/27/20 12:25 PM, John Fastabend wrote:
> David Ahern wrote:
>> From: David Ahern <dsahern@...il.com>
>>
>> This series adds support for XDP in the egress path by introducing
>> a new XDP attachment type, BPF_XDP_EGRESS, and adding a UAPI to
>> if_link.h for attaching the program to a netdevice and reporting
>> the program. bpf programs can be run on all packets in the Tx path -
>> skbs or redirected xdp frames. The intent is to emulate the current
>> RX path for XDP as much as possible to maintain consistency and
>> symmetry in the 2 paths with their APIs.
>>
>> This is a missing primitive for XDP allowing solutions to build small,
>> targeted programs properly distributed in the networking path allowing,
>> for example, an egress firewall/ACL/traffic verification or packet
>> manipulation and encapping an entire ethernet frame whether it is
>> locally generated traffic, forwarded via the slow path (ie., full
>> stack processing) or xdp redirected frames.
>
> I'm still a bit unsure why the BPF programs would not push logic into
> ingress XDP program + skb egress. Is there a case where that does not
> work or is it mostly about ease of use for some use case?
The use case is a host with VMs.
Some packets take the XDP fast path (known unicast traffic redirected
from host ingress to VM tap or from one VM tap to another VM tap); some
packets take the slow path. Regardless of path, each VM can have its own
per-VM data (e.g., an ingress ACL). With the XDP egress program seeing both
packet formats, that per-VM ACL config only needs to live in one place -
the egress program's map.
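As a purely illustrative sketch of what that looks like - extending the
skeleton above with a made-up per-VM map keyed by destination MAC; none of
this is from the series and it is untested:

/* Illustrative only: a per-VM ACL living in a single map in the egress
 * program. Key/value layout is made up; destination MAC stands in for
 * "which VM is this frame headed to".
 */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>

struct vm_acl_val {
        __u8 drop;                      /* simplest possible policy: drop or pass */
};

struct {
        __uint(type, BPF_MAP_TYPE_HASH);
        __uint(max_entries, 256);
        __type(key, __u8[ETH_ALEN]);    /* destination MAC == VM identity */
        __type(value, struct vm_acl_val);
} vm_acl SEC(".maps");

SEC("xdp")
int xdp_egress_acl(struct xdp_md *ctx)
{
        void *data = (void *)(long)ctx->data;
        void *data_end = (void *)(long)ctx->data_end;
        struct ethhdr *eth = data;
        struct vm_acl_val *val;
        __u8 dmac[ETH_ALEN];

        if ((void *)(eth + 1) > data_end)
                return XDP_DROP;

        __builtin_memcpy(dmac, eth->h_dest, ETH_ALEN);

        /* One lookup covers both paths: redirected frames and skbs both pass
         * through this egress hook on their way to the device.
         */
        val = bpf_map_lookup_elem(&vm_acl, dmac);
        if (val && val->drop)
                return XDP_DROP;

        return XDP_PASS;
}

char _license[] SEC("license") = "GPL";

Updating policy for a VM then means touching a single map entry, no matter
which path its traffic takes.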
>
> Do we have overhead performance numbers? I'm wondering how close the
> redirect case with these TX hooks are vs redirect without TX hooks.
> The main reason I ask is if it slows performance down by more than say
> 5% (sort of made up number, but point is some N%) then I don't think
> we would recommend using it.
Toke ran some tests:
"On a test using xdp_redirect_map from samples/bpf, which gets 8.15 Mpps
normally, loading an XDP egress program on the target interface drops
performance to 7.55 Mpps. So ~600k pps, or ~9.5ns overhead for the
egress program."
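(For reference, 1/8.15 Mpps is ~122.7 ns per packet and 1/7.55 Mpps is
~132.5 ns, so the delta works out to roughly 10 ns per packet, consistent
with the ~9.5 ns estimate above.)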
If XDP redirect gives a 2-4x speedup over the skb path and the presence of
an egress program takes away ~8-10% of that speedup, it is still a huge
overall performance win, with a much simpler design, architecture and
lifecycle management for the per-VM data.
>
>>
>> Nothing about running a program in the Tx path requires driver specific
>> resources like the Rx path has. Thus, programs can be run in core
>> code and attached to the net_device struct similar to skb mode. The
>> egress attach is done using the new XDP_FLAGS_EGRESS_MODE flag, and
>> is reported by the kernel using the XDP_ATTACHED_EGRESS_CORE attach
>> flag with IFLA_XDP_EGRESS_PROG_ID making the api similar to existing
>> APIs for XDP.
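To make the attach side concrete, here is an untested sketch of what
attaching could look like from userspace with libbpf - the pin path is made
up, and the fallback flag value below is only a placeholder (the real one
comes from the patched if_link.h):

#include <stdio.h>
#include <stdlib.h>
#include <bpf/bpf.h>
#include <bpf/libbpf.h>
#include <linux/if_link.h>

#ifndef XDP_FLAGS_EGRESS_MODE
#define XDP_FLAGS_EGRESS_MODE (1U << 5) /* placeholder; use the patched if_link.h */
#endif

int main(int argc, char **argv)
{
        int ifindex, prog_fd, err;

        if (argc != 3) {
                fprintf(stderr, "usage: %s <ifindex> <pinned prog path>\n", argv[0]);
                return 1;
        }
        ifindex = atoi(argv[1]);

        /* e.g. a program pinned at /sys/fs/bpf/xdp_egress_acl (made-up path) */
        prog_fd = bpf_obj_get(argv[2]);
        if (prog_fd < 0) {
                perror("bpf_obj_get");
                return 1;
        }

        /* Same netlink attach call as Rx XDP, just with the egress flag; the
         * kernel reports the attachment back via XDP_ATTACHED_EGRESS_CORE and
         * IFLA_XDP_EGRESS_PROG_ID.
         */
        err = bpf_set_link_xdp_fd(ifindex, prog_fd, XDP_FLAGS_EGRESS_MODE);
        if (err) {
                fprintf(stderr, "attach failed: %d\n", err);
                return 1;
        }

        return 0;
}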
>>
>> The locations chosen to run the egress program - __netdev_start_xmit
>> before the call to ndo_start_xmit and bq_xmit_all before invoking
>> ndo_xdp_xmit - allow follow on patch sets to handle tx queueing and
>> setting the queue index if multi-queue with consistency in handling
>> both packet formats.
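In other words, the two hook points line up with the two packet shapes:

  locally generated / forwarded skbs: __netdev_start_xmit() -> egress prog -> ndo_start_xmit()
  redirected xdp frames:              bq_xmit_all() -> egress prog -> ndo_xdp_xmit()

so the program runs in core code and the driver does not need to know an
egress program is attached.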
>>
>> A few of the patches trace back to work done on offloading programs
>> from a VM by Jason Wang and Prashant Bole.
>
> The idea for offloading VM programs would be to take a BPF program
> from the VM somehow out of band or over mgmt interface and load it
> into the egress hook of virtio?
The latest thought is for programs to run in the vhost thread. Offloaded
programs for a guest should run in process context where the cycles can
be associated with the VM.
>
> Code LGTM other than a couple suggestions on the test side but I'm
> missing something on the use case picture.
Thanks for the review.