Date:   Wed, 9 Sep 2020 22:51:41 +0200
From:   Lorenzo Bianconi <lorenzo.bianconi@...hat.com>
To:     John Fastabend <john.fastabend@...il.com>
Cc:     Lorenzo Bianconi <lorenzo@...nel.org>, netdev@...r.kernel.org,
        bpf@...r.kernel.org, davem@...emloft.net, brouer@...hat.com,
        echaudro@...hat.com, sameehj@...zon.com, kuba@...nel.org,
        daniel@...earbox.net, ast@...nel.org, shayagr@...zon.com,
        edumazet@...gle.com
Subject: Re: [PATCH v2 net-next 6/9] bpf: helpers: add
 bpf_xdp_adjust_mb_header helper

> > Lorenzo Bianconi wrote:
> > > > Lorenzo Bianconi wrote:
> > > > > > Lorenzo Bianconi wrote:
> > > 
> > > [...]
> > > 
> > > > > > > + *	Description
> > > > > > > + *		Adjust frame headers moving *offset* bytes from/to the second
> > > > > > > + *		buffer to/from the first one. This helper can be used to move
> > > > > > > + *		headers when the hw DMA SG does not copy all the headers in
> > > > > > > + *		the first fragment.
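
(For context, a minimal XDP program using the proposed helper could look
roughly like the sketch below. The argument convention and return-value
check here are assumptions for illustration; the actual signature is the
one defined by this patch.)

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int xdp_pull_headers(struct xdp_md *ctx)
{
	/* move up to 128 bytes of headers from the second buffer into
	 * the first one before parsing them
	 */
	if (bpf_xdp_adjust_mb_header(ctx, 128) < 0)
		return XDP_DROP;

	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
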
> > > > > 
> > > > > + Eric to the discussion
> > > > > 
> > 
> > [...]
> > 

[...]

> > 
> > Still, in a normal L2/L3/L4 use case I expect all the headers you
> > need to be in the first buffer, so it's unlikely that use cases that
> > send most traffic via XDP_TX, for example, will ever need the extra
> > info. In these cases I think you are paying some penalty for
> > having to do the work of populating the shinfo. Maybe it's measurable,
> > maybe not, I'm not sure.
> > 
> > Also, if we make it required for multi-buffer then we also need
> > the shinfo on 40gbps or 100gbps nics, and now even small costs
> > matter.
> 
> Now I realize I used the word "split" in an unclear way here,
> I apologize for that.
> What I mean is not related to "header" split; I am referring to the case where
> the hw is configured with a given rx buffer size (e.g. 1 PAGE) and we have
> set a higher MTU/max receive size (e.g. 9K).
> In this case the hw will "split" the received jumbo frame over multiple rx
> buffers/descriptors. By populating the "xdp_shared_info" we will forward this
> layout info to the eBPF sandbox and to a remote driver/cpu.
> Please note this use case is not currently covered by XDP, so if we develop it
> in a proper way I guess we should not get any performance hit for the legacy
> single-buffer mode since we will not populate the shared_info for it (I think
> you refer to the "legacy" use-case in your "normal L2/L3/L4" example, right?)
> Anyway I will run some tests to verify that performance for the single-buffer
> use-case is not hit.
> 
> Regards,
> Lorenzo
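
To make the quoted jumbo-frame layout a bit more concrete, the driver-side
idea is roughly the sketch below. The struct/field/helper names
(xdp_shared_info, nr_frags, data_length, xdp_get_shared_info_from_buff) follow
this series but are only illustrative here, and rx_desc[] stands for whatever
per-buffer info the driver keeps:

	struct xdp_shared_info *xdp_sinfo = xdp_get_shared_info_from_buff(&xdp);
	int i;

	xdp_sinfo->nr_frags = 0;
	xdp_sinfo->data_length = 0;

	/* one frag entry per rx buffer/descriptor after the first one */
	for (i = 1; i < num_rx_desc; i++) {
		skb_frag_t *frag = &xdp_sinfo->frags[xdp_sinfo->nr_frags++];

		__skb_frag_set_page(frag, rx_desc[i].page);
		skb_frag_off_set(frag, rx_desc[i].offset);
		skb_frag_size_set(frag, rx_desc[i].len);
		xdp_sinfo->data_length += rx_desc[i].len;
	}

The first buffer stays in the xdp_buff as usual; only the extra buffers end up
in the frag array, so the legacy single-buffer path does not need to touch it.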

I carried out some performance measurements on my Espressobin to check whether
the XDP "single buffer" use-case is hit by introducing xdp multi-buff support.
Each test was carried out sending ~900Kpps (pkt length 64B). The rx
buffer size was set to 1 PAGE (default value).
The results are roughly the same:

commit: f2ca673d2cd5 "net: mvneta: fix use of state->speed"
==========================================================
- XDP-DROP: ~ 740 Kpps
- XDP-TX: ~ 286 Kpps
- XDP-PASS + tc drop: ~ 219.5 Kpps

xdp multi-buff:
===============
- XDP-DROP: ~ 739-740 Kpps
- XDP-TX: ~ 285 Kpps
- XDP-PASS + tc drop: ~ 223 Kpps
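
(For reference, the XDP-DROP and XDP-TX tests use trivial programs of roughly
this shape; this is just a sketch, not the exact samples used:)

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int xdp_drop_prog(struct xdp_md *ctx)
{
	return XDP_DROP;
}

SEC("xdp")
int xdp_tx_prog(struct xdp_md *ctx)
{
	return XDP_TX;
}

char _license[] SEC("license") = "GPL";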

I will add these results to the v3 cover letter.

Regards,
Lorenzo

> 
> > 
> > > 
> > > > 
> > > > If you take the simplest possible program that just returns XDP_TX
> > > > and run a pkt generator against it, I believe (haven't run any
> > > > tests) that you will see overhead now just from populating this
> > > > shinfo. I think it needs to only be done when it's needed, e.g. when
> > > > the user makes this helper call or we need to build the skb and
> > > > populate the frags there.
> > > 
> > > sure, I will carry out some tests.
> > 
> > Thanks!
> > 
> > > 
> > > > 
> > > > I think a smart driver will just keep the frags list in whatever
> > > > form it has them (rx descriptors?) and push them over to the
> > > > tx descriptors without having to do extra work with frag lists.
> > > 
> > > I think there are many use-cases where we want to have this info available in
> > > xdp_buff/xdp_frame. E.g. let's consider the following Jumbo frame example:
> > > - MTU > 1 PAGE (so the driver will split the received data over multiple rx
> > >   descriptors)
> > > - the driver performs an XDP_REDIRECT to a veth or cpumap
> > > 
> > > Relying on the proposed architecture we could enable GRO in veth or cpumap, I
> > > guess, since we can build a non-linear skb from the xdp multi-buff, right?
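
(Roughly, on the consumer side (cpumap or veth, for instance) the frags carried
in the xdp multi-buff could then be attached to the skb along these lines; only
a sketch, the shared-info accessor names are assumptions:)

	for (i = 0; i < xdp_sinfo->nr_frags; i++) {
		skb_frag_t *frag = &xdp_sinfo->frags[i];

		skb_add_rx_frag(skb, i, skb_frag_page(frag),
				skb_frag_off(frag), skb_frag_size(frag),
				PAGE_SIZE);
	}

With the frags in place the resulting skb is non-linear and can be handed to
napi_gro_receive() as usual.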
> > 
> > I'm not disputing there are use-cases. But, I'm trying to see if we
> > can cover those without introducing additional latency in other
> > cases. Hence the extra benchmarks request ;)
> > 
> > > 
> > > > 
> > > > > 
> > > > > > 
> > > > > > Did you benchmark this?
> > > > > 
> > > > > will do, I need to understand if we can use tiny buffers in mvneta.
> > > > 
> > > > Why tiny buffers? How does mvneta lay out the frags when doing
> > > > header split? Can we just benchmark what mvneta is doing at the
> > > > end of this patch series?
> > > 
> > > for the moment mvneta can split the received data when the previous buffer is
> > > full (e.g. when the first page is completely written). I want to explore if
> > > I can set a tiny buffer (e.g. 128B) as the max receive buffer to run some
> > > performance tests and have some "comparable" results with respect to the ones
> > > I got when I added XDP support to mvneta.
> > 
> > OK would be great.
> > 
> > > 
> > > > 
> > > > Also can you try the basic XDP_TX case mentioned above.
> > > > I don't want this to degrade existing use cases if at all
> > > > possible.
> > > 
> > > sure, will do.
> > 
> > Thanks!
> > 


