Message-ID: <20170829152601.02fdb8a3@redhat.com>
Date:   Tue, 29 Aug 2017 15:26:01 +0200
From:   Jesper Dangaard Brouer <brouer@...hat.com>
To:     Alexander Duyck <alexander.duyck@...il.com>
Cc:     Andy Gospodarek <andy@...yhouse.net>,
        Michael Chan <michael.chan@...adcom.com>,
        John Fastabend <john.fastabend@...il.com>,
        "Duyck, Alexander H" <alexander.h.duyck@...el.com>,
        "pstaszewski@...are.pl" <pstaszewski@...are.pl>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "xdp-newbies@...r.kernel.org" <xdp-newbies@...r.kernel.org>,
        "borkmann@...earbox.net" <borkmann@...earbox.net>,
        brouer@...hat.com
Subject: Re: XDP redirect measurements, gotchas and tracepoints


On Mon, 28 Aug 2017 09:11:25 -0700 Alexander Duyck <alexander.duyck@...il.com> wrote:

> My advice would be to not over complicate this. My big concern with
> all this buffer recycling is what happens the first time somebody
> introduces something like mirroring? Are you going to copy the data to
> a new page which would be quite expensive or just have to introduce
> reference counts? You are going to have to deal with stuff like
> reference counts eventually so you might as well bite that bullet now.
> My advice would be to not bother with optimizing for performance right
> now and instead focus on just getting functionality. The approach we
> took in ixgbe for the transmit path should work for almost any other
> driver since all you are looking at is having to free the page
> reference which takes care of reference counting already.

This return API is not about optimizing performance right now.  It is
actually about allowing us to change the underlying memory model per RX
queue for XDP.

If an RX-ring is used for both SKBs and XDP, then the refcnt model is
still enforced.  However, a driver using the 1-packet-per-page model
should be able to reuse refcnt==1 pages when returned from XDP.
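
To make the refcnt==1 reuse case concrete, here is a rough userspace
sketch, not code from any driver.  The names toy_page,
toy_rx_can_recycle() and toy_xdp_return_page() are hypothetical, and a
plain int stands in for the real page reference count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page; a plain int models its refcount. */
struct toy_page {
	int refcnt;
};

/* The RX ring may reuse the page only if it holds the sole reference. */
static bool toy_rx_can_recycle(const struct toy_page *pg)
{
	return pg->refcnt == 1;
}

/*
 * Assumed shape of a return path: called when XDP hands a page back.
 * Returns true if the page was recycled onto the RX ring, false if the
 * driver only dropped its own reference because another user (e.g. an
 * SKB) still holds the page.
 */
static bool toy_xdp_return_page(struct toy_page *pg)
{
	if (toy_rx_can_recycle(pg))
		return true;	/* driver keeps its single reference */

	pg->refcnt--;		/* fall back to normal refcnt release */
	return false;
}
```

In this sketch a page with refcnt==1 goes straight back to the ring
without touching the count, while a shared page follows the ordinary
refcnt release.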

If an RX-ring is _ONLY_ used for XDP, then the driver has the freedom to
implement another memory model, with the return-API.  We need to
experiment to find the optimal memory model.  The 1-packet-per-page
model is actually not the fastest, because of PCI-e bottlenecks.  With
HW support for packing descriptors and packets over the PCI-e bus, much
higher rates can be achieved.  Mellanox mlx5-Lx already has the needed HW
support.  And companies like NetCope also have 100G HW that does
similar tricks, and they even have a whitepaper[1][2] on how they are
faster than DPDK with their NDP (Netcope Data Plane) API.

We do need the ability/flexibility to change the RX memory model, to
take advantage of this new NIC hardware.
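
One possible shape for such a per-RX-queue return hook, purely as an
illustration: every name below (toy_rxq, toy_mem_model, toy_xdp_return()
and so on) is made up for this sketch and is not an existing kernel API.
The counters only stand in for the real work of freeing or recycling a
buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memory models an RX queue could select. */
enum toy_mem_model {
	TOY_MEM_PAGE_REFCNT,	/* ring shared with SKBs: refcnt rules apply */
	TOY_MEM_DRIVER_POOL,	/* XDP-only ring: driver-private model */
};

struct toy_rxq;
typedef void (*toy_return_fn)(struct toy_rxq *rxq, void *buf);

struct toy_rxq {
	enum toy_mem_model model;
	toy_return_fn return_buf;	/* per-queue return hook */
	unsigned int freed;		/* buffers released via refcnt */
	unsigned int recycled;		/* buffers handed back to the pool */
};

static void toy_return_refcnt(struct toy_rxq *rxq, void *buf)
{
	(void)buf;
	rxq->freed++;
}

static void toy_return_pool(struct toy_rxq *rxq, void *buf)
{
	(void)buf;
	rxq->recycled++;
}

/* Pick the memory model once, when the queue is set up. */
static void toy_rxq_init(struct toy_rxq *rxq, int xdp_only)
{
	rxq->model = xdp_only ? TOY_MEM_DRIVER_POOL : TOY_MEM_PAGE_REFCNT;
	rxq->return_buf = xdp_only ? toy_return_pool : toy_return_refcnt;
	rxq->freed = 0;
	rxq->recycled = 0;
}

/* Callers return buffers without knowing which model the queue uses. */
static void toy_xdp_return(struct toy_rxq *rxq, void *buf)
{
	rxq->return_buf(rxq, buf);
}
```

The point of the indirection is that the memory model is a property of
the RX queue, so it can be swapped without touching the code that
returns buffers.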

[1] https://www.netcope.com/en/resources/improving-dpdk-performance
[2] https://www.netcope.com/en/company/press-center/press-releases/read-new-netcope-whitepaper-on-dpdk-acceleration

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
