Date:   Mon, 30 Jul 2018 19:49:04 -0700
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     Björn Töpel <bjorn.topel@...il.com>
Cc:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Björn Töpel <bjorn.topel@...el.com>,
        "Karlsson, Magnus" <magnus.karlsson@...el.com>,
        oss-drivers@...ronome.com, Netdev <netdev@...r.kernel.org>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>,
        Francois Ozog <francois.ozog@...aro.org>,
        Mykyta Iziumtsev <mykyta.iziumtsev@...aro.org>
Subject: Re: [RFC bpf-next 0/6] net: xsk: minor improvements around queue
 handling

On Mon, 30 Jul 2018 14:49:32 +0200, Björn Töpel wrote:
> On Thu, 26 Jul 2018 at 23:44, Jakub Kicinski wrote:
> >
> > Hi!
> 
> Thanks for spending your time on this, Jakub. I'm (temporarily) back
> for a week, so you can expect faster replies now...
> 
> > This set tries to make the core take care of error checking for the
> > drivers.  In particular making sure that the AF_XDP UMEM is not installed
> > on queues which don't exist (or are disabled) and that changing queue
> > (AKA ethtool channel) count cannot disable queues with active AF_XDP
> > zero-copy sockets.
> >
> > I'm sending as an RFC because I'm not entirely sure what the desired
> > behaviour is here.  Is it okay to install AF_XDP on queues which don't
> > exist?  I presume not?  
> 
> Your presumption is correct. The idea with the
> real_num_rx_queues/real_num_tx_queues check in xsk_bind is to bail
> out if the queue doesn't exist at bind time. Note that we *didn't* add
> any code to prevent the bound queue from being removed via set channel
> (your patch 6). Our idea was that if you remove a queue, the ingress
> frames would simply stop flowing, and the queue config change checking
> was "out-of-band".
> 
> I think I prefer your approach, i.e. not allowing the channels/queues
> to change if they're bound to an AF_XDP socket. However, your
> xdp_umem_query used in ethtool only works for ZC enabled drivers, not
> for the existing non-ZC/copy case. If we'd like to go the route of
> disabling ethtool_set_channels for an AF_XDP enabled queue, the
> query needs to move into the netdev core, so that we get consistent
> behavior.

Agreed.  There seems to be no notification when the number of queues
changes, and therefore no clean way to solve this today.  The last
two patches are more of a courtesy to the drivers, to simplify the
data structure holding the UMEMs.

I could argue that driver and stack are not really apples to apples.
Much like Generic XDP, the skb-based AF_XDP is basically a development
tool and a last-resort fallback.  For TX the driver will most likely
allocate separate queues, while the skb-based path will use the stack's
queues.  These are actually different queues.  The stack will also fall
back to another queue in __netdev_pick_tx() if the number of queues
changes.

But yes, preferably skb-based and ZC should behave the same.

> > Are the AF_XDP queue_ids referring to TX queues
> > as well as RX queues in case of the driver?  I presume not?  
> 
> We've had a lot of discussions about this internally. Ideally, we'd
> like to give driver implementors the most freedom, and not enforce a
> certain queue scheme for Tx. 

You say freedom; I hear diverging implementations and per-driver
checks in user space ;-)

Practically speaking, unless you take the xmit lock there is little
chance of reusing the stack's TX queues, so you'd have to allocate a
separate queue one way or the other.  At which point the number of
the stack's TX queues has no bearing on AF_XDP ZC.

> OTOH it makes it weird for the userland
> application *not* to have the same id, e.g. if a userland application
> would like to get stats or configure the AF_XDP bound Tx queue --
> which id is it? Should the Tx queue id for an xsk be exposed in
> sysfs?

I'd not go there.

> If the id is *not* the same, would it be OK to change the number of
> channels and Tx would continue to operate correctly? A related
> question: should an xsk with *only* Tx be constrained by the
> number of (enabled) Rx queues?

Good question, are drivers even supposed to care about tx-only/rx-only?
From the driver's perspective an rx-only socket will simply never have
anything to transmit, and a tx-only socket will never successfully
xdp_do_redirect().  So IMHO - yes, a tx-only XSK still only cares about
RX queues; the driver doesn't know that no RX will ever happen.  The
fact that the user has to populate the FILL queue on a tx-only socket
may be counter-intuitive to many...

I was considering proposing a change to drop the *x-only option in the
net tree; I don't really see much use for it :S  And it potentially
creates weird corner cases.

> I'd be happy to hear some more opinions/thoughts on this...

+1

> > Should we try to prevent disabling queues which have non zero-copy
> > sockets installed as well? :S
> >  
> 
> Yes, the ZC/non-ZC case must be consistent IMO. See comment above.
> 
> > Anyway, if any of those patches seem useful and reasonable, please
> > let me know I will repost as non-RFC.
> >  
> 
> I definitely think patch 2 and 3 (and probably 1) should go as
> non-RFC!

Thanks Björn, I will submit the first three!

> Thanks for spotting that we're not holding the rtnl lock when checking
> the # queues (patch 4)!
> 
> Björn
