Date:   Fri, 2 Oct 2020 21:53:27 +0200
From:   Daniel Borkmann <daniel@...earbox.net>
To:     John Fastabend <john.fastabend@...il.com>,
        Lorenzo Bianconi <lorenzo@...nel.org>, bpf@...r.kernel.org,
        netdev@...r.kernel.org
Cc:     davem@...emloft.net, kuba@...nel.org, ast@...nel.org,
        shayagr@...zon.com, sameehj@...zon.com, dsahern@...nel.org,
        brouer@...hat.com, lorenzo.bianconi@...hat.com, echaudro@...hat.com
Subject: Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer
 support

On 10/2/20 5:25 PM, John Fastabend wrote:
> Lorenzo Bianconi wrote:
>> This series introduces XDP multi-buffer support. The mvneta driver is
>> the first to support these new "non-linear" xdp_{buff,frame}. Reviewers,
>> please focus on how these new types of xdp_{buff,frame} packets
>> traverse the different layers and on the layout design. The BPF helpers
>> are deliberately kept simple, because we don't want to expose the
>> internal layout and want to allow later changes.
>>
>> For now, to keep the design simple and to maintain performance, the XDP
>> BPF-prog (still) only has access to the first buffer. It is left for
>> later (another patchset) to add payload access across multiple buffers.
>> This patchset should still allow for these future extensions. The goal
>> is to lift the MTU restriction that comes with XDP while maintaining
>> the same performance as before.
>>
>> The main idea for the new multi-buffer layout is to reuse the same
>> layout used for non-linear SKBs. This relies on the "skb_shared_info"
>> struct at the end of the first buffer to link together subsequent
>> buffers. Keeping the layout compatible with SKBs is also done to ease
>> and speed up creating an SKB from an xdp_{buff,frame}. Converting an
>> xdp_frame to an SKB and delivering it to the network stack is shown in
>> the cpumap code (patch 13/13).
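
To make the tailroom idea concrete, below is a simplified, self-contained
sketch of the layout (struct and field names are illustrative only, not the
kernel's or the patchset's definitions): a shared-info block sits at the end
of the first buffer's frame, and its frag array links the follow-up buffers,
mirroring how skb_shared_info trails an SKB's head buffer.

  #include <stdio.h>

  /* Simplified model of the multi-buffer layout described above: the
   * tailroom of the first buffer holds a shared-info block whose frag
   * array points at the follow-up buffers, like a non-linear SKB. */
  struct frag {
          void *page;             /* follow-up buffer */
          unsigned int size;      /* payload bytes in that buffer */
  };

  struct shared_info {
          unsigned int nr_frags;
          struct frag frags[17];
  };

  struct xdp_buff_model {
          void *data_hard_start;  /* start of the first buffer */
          void *data;             /* start of packet headers */
          void *data_end;         /* end of linear payload */
          unsigned int frame_sz;  /* full buffer size incl. tailroom */
  };

  /* The shared-info block lives at the very end of the first buffer,
   * just like skb_shared_info at the end of an SKB's head buffer. */
  static struct shared_info *get_shared_info(struct xdp_buff_model *xdp)
  {
          return (struct shared_info *)((char *)xdp->data_hard_start +
                                        xdp->frame_sz -
                                        sizeof(struct shared_info));
  }

  int main(void)
  {
          static char page[4096];
          struct xdp_buff_model xdp = {
                  .data_hard_start = page,
                  .data            = page + 256,
                  .data_end        = page + 256 + 1500,
                  .frame_sz        = sizeof(page),
          };
          struct shared_info *sinfo = get_shared_info(&xdp);

          /* a driver would fill frags[] here for multi-buffer frames */
          sinfo->nr_frags = 0;
          printf("shared info starts at offset %ld of %u\n",
                 (long)((char *)sinfo - page), xdp.frame_sz);
          return 0;
  }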
> 
> Using the end of the buffer for the skb_shared_info struct is going to
> become driver API, so unwinding it if it proves to be a performance issue
> is going to be ugly. So same question as before: for the use case where
> we receive a packet and do XDP_TX with it, how do we avoid the cache miss
> overhead? This is not just a hypothetical use case; the Facebook
> load balancer does this, as does Cilium, and allowing this with
> multi-buffer packets >1500B would be useful.
[...]

Fully agree. My other question would be whether someone else is currently in the
process of implementing this scheme for a 40G+ NIC. My concern is that the numbers
below are rather on the lower end of the spectrum, so I would like to see a comparison
of XDP as-is today vs XDP multi-buff on a higher-end NIC, so that we have a picture of
how well the currently designed scheme works there and which performance issues we'll
run into, e.g. under a typical XDP L4 load balancer scenario with XDP_TX. I think this
would be crucial before the driver API becomes 'sort of' set in stone, others start
adopting it, and changing the design becomes painful. Do the ena folks have an
implementation ready as well? And what about virtio_net, for example, is anyone
committing there too? Typically, for such features to land, we require at least 2
drivers implementing them.
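
For reference, the XDP_TX fast path in question looks roughly like the sketch
below (illustrative only; the program and names are mine, not Katran's or
Cilium's): headers in the first buffer are rewritten and the frame is bounced
straight back out, so an unconditional access to shared_info in the buffer's
tailroom would add cache-miss overhead that this path never otherwise pays.

  /* Minimal XDP_TX bouncer: parse and rewrite headers in the first
   * buffer only, then transmit the frame back out the same interface. */
  #include <linux/bpf.h>
  #include <linux/if_ether.h>
  #include <bpf/bpf_helpers.h>

  SEC("xdp")
  int xdp_tx_bounce(struct xdp_md *ctx)
  {
          void *data = (void *)(long)ctx->data;
          void *data_end = (void *)(long)ctx->data_end;
          struct ethhdr *eth = data;
          unsigned char tmp[ETH_ALEN];

          /* bounds check required by the verifier */
          if ((void *)(eth + 1) > data_end)
                  return XDP_DROP;

          /* swap source and destination MACs and bounce the frame */
          __builtin_memcpy(tmp, eth->h_dest, ETH_ALEN);
          __builtin_memcpy(eth->h_dest, eth->h_source, ETH_ALEN);
          __builtin_memcpy(eth->h_source, tmp, ETH_ALEN);

          return XDP_TX;
  }

  char _license[] SEC("license") = "GPL";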

>> Typical use cases for this series are:
>> - Jumbo-frames
>> - Packet header split (please see Google's use-case @ NetDevConf 0x14, [0])
>> - TSO
>>
>> More info about the main idea behind this approach can be found here [1][2].
>>
>> We carried out some throughput tests in a standard linear frame scenario in order
>> to verify we did not introduce any performance regression when adding xdp multi-buff
>> support to mvneta:
>>
>> offered load is ~ 1000Kpps, packet size is 64B, mvneta descriptor size is one PAGE
>>
>> commit: 879456bedbe5 ("net: mvneta: avoid possible cache misses in mvneta_rx_swbm")
>> - xdp-pass:      ~162Kpps
>> - xdp-drop:      ~701Kpps
>> - xdp-tx:        ~185Kpps
>> - xdp-redirect:  ~202Kpps
>>
>> mvneta xdp multi-buff:
>> - xdp-pass:      ~163Kpps
>> - xdp-drop:      ~739Kpps
>> - xdp-tx:        ~182Kpps
>> - xdp-redirect:  ~202Kpps
[...]
