lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJ8uoz2Bwrj2iMBn-DA6km=Zac45XigegdAOQQ9rWQkJMz8P=Q@mail.gmail.com>
Date:   Tue, 2 Oct 2018 14:49:13 +0200
From:   Magnus Karlsson <magnus.karlsson@...il.com>
To:     jakub.kicinski@...ronome.com
Cc:     "Karlsson, Magnus" <magnus.karlsson@...el.com>,
        Björn Töpel <bjorn.topel@...el.com>,
        ast@...nel.org, Daniel Borkmann <daniel@...earbox.net>,
        Network Development <netdev@...r.kernel.org>,
        Jesper Dangaard Brouer <brouer@...hat.com>
Subject: Re: [PATCH bpf-next v2 0/5] xsk: fix bug when trying to use both copy
 and zero-copy mode

On Mon, Oct 1, 2018 at 10:34 PM Jakub Kicinski
<jakub.kicinski@...ronome.com> wrote:
>
> On Mon,  1 Oct 2018 14:51:32 +0200, Magnus Karlsson wrote:
> > Jakub, please take a look at your patches. The last one I had to
> > change slightly to make it fit with the new interface
> > xdp_get_umem_from_qid(). An added bonus with this function is that we,
> > in the future, can also use it from the driver to get a umem, thus
> > simplifying driver implementations (and later remove the umem from the
> > NDO completely). Björn will mail patches, at a later point in time,
> > using this in the i40e and ixgbe drivers, that removes a good chunk of
> > code from the ZC implementations.
>
> Nice, drivers which don't follow the prepare/commit model of handling
> reconfigurations will benefit!
>
> > I also made your code aware of Tx queues. If we create a socket that
> > only has a Tx queue, then the queue id will refer to a Tx queue id
> > only and could be larger than the available amount of Rx queues.
> > Please take a look at it.
>
> The semantics of Tx queue id are slightly unclear.  To me XDP is
> associated with Rx, so the qid in driver context can only refer to
> Rx queue and its associated XDP Tx queue.  It does not mean the Tx
> queue stack uses, like it does for copy fallback.  If one doesn't have
> a Rx queue $id, there will be no associated XDP Tx queue $id (in all
> drivers but Intel, and virtio, which use per-CPU Tx queues making TX
> queue even more meaningless).
>
> Its to be seen how others implement AF_XDP.  My general feeling is
> that we should only talk about Rx queues in context of driver XDP.

This is the way I see it. From an uapi point of view we can create a
socket that can only do Rx, only Tx or both. We then bind this socket
to a specific queue id on a device. If a packet is received on this
queue id it is sent (by the default xdpsock sample program) to the
socket. If a packet is sent on this socket it goes out on this same
queue id. If you have not registered an Rx ring (in user space) for
this socket, you cannot receive anything on this socket. And
conversely, if you have no Tx ring, you will not be able to send
anything.

But if we take a look at this from the driver perspective and the NDO
XDP_SETUP_XSK_UMEM, today it does not know anything about if Rx and Tx
rings have been setup in the socket. It will always initialize the HW
Rx and Tx queues of the supplied queue id. So with today's NDO
interface you will always get a Rx/Tx queue pair. In order to realize
the uapi above in an efficient manner and to support devices with more
Tx queues than Rx, we need to change the NDO.

Just as a note, in the applications I am used to work on, radio base
stations and other telecom apps, it is the common case to have many
more Tx queues than Rx queues just to be able to use scheduling,
shaping and other QoS features that are important on egress in those
systems. Therefore the interest in supporting Tx only queues. But
maybe this is just a weird case, do not know.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ