lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180920074143.0add8cc4@cakuba.netronome.com>
Date:   Thu, 20 Sep 2018 07:41:43 -0700
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     Magnus Karlsson <magnus.karlsson@...il.com>
Cc:     ys114321@...il.com, "Karlsson, Magnus" <magnus.karlsson@...el.com>,
        Björn Töpel <bjorn.topel@...el.com>,
        ast@...nel.org, Daniel Borkmann <daniel@...earbox.net>,
        Network Development <netdev@...r.kernel.org>
Subject: Re: [PATCH bpf-next 2/2] xsk: fix bug when trying to use both copy
 and zero-copy on one queue id

On Thu, 20 Sep 2018 11:17:17 +0200, Magnus Karlsson wrote:
> On Wed, Sep 19, 2018 at 9:18 AM Magnus Karlsson wrote:
> > On Wed, Sep 19, 2018 at 3:58 AM Jakub Kicinski wrote:  
> > >
> > > On Tue, 18 Sep 2018 10:22:11 -0700, Y Song wrote:  
> > > > > +/* The umem is stored both in the _rx struct and the _tx struct as we do
> > > > > + * not know if the device has more tx queues than rx, or the opposite.
> > > > > + * This might also change during run time.
> > > > > + */
> > > > > +static void xdp_reg_umem_at_qid(struct net_device *dev, struct xdp_umem *umem,
> > > > > +                               u16 queue_id)
> > > > > +{
> > > > > +       if (queue_id < dev->real_num_rx_queues)
> > > > > +               dev->_rx[queue_id].umem = umem;
> > > > > +       if (queue_id < dev->real_num_tx_queues)
> > > > > +               dev->_tx[queue_id].umem = umem;
> > > > > +}
> > > > > +
> > > > > +static struct xdp_umem *xdp_get_umem_from_qid(struct net_device *dev,
> > > > > +                                             u16 queue_id)
> > > > > +{
> > > > > +       if (queue_id < dev->real_num_rx_queues)
> > > > > +               return dev->_rx[queue_id].umem;
> > > > > +       if (queue_id < dev->real_num_tx_queues)
> > > > > +               return dev->_tx[queue_id].umem;
> > > > > +
> > > > > +       return NULL;
> > > > > +}
> > > > > +
> > > > > +static void xdp_clear_umem_at_qid(struct net_device *dev, u16 queue_id)
> > > > > +{
> > > > > +       /* Zero out the entry independent on how many queues are configured
> > > > > +        * at this point in time, as it might be used in the future.
> > > > > +        */
> > > > > +       if (queue_id < dev->num_rx_queues)
> > > > > +               dev->_rx[queue_id].umem = NULL;
> > > > > +       if (queue_id < dev->num_tx_queues)
> > > > > +               dev->_tx[queue_id].umem = NULL;
> > > > > +}
> > > > > +  
> > > >
> > > > I am sure whether the following scenario can happen or not.
> > > > Could you clarify?
> > > >    1. suppose initially we have num_rx_queues = num_tx_queues = 10
> > > >        xdp_reg_umem_at_qid() set umem1 to queue_id = 8
> > > >    2. num_tx_queues is changed to 5
> > > >    3. xdp_clear_umem_at_qid() is called for queue_id = 8,
> > > >        and dev->_rx[8].umum = 0.
> > > >    4. xdp_reg_umem_at_qid() is called gain to set for queue_id = 8
> > > >        dev->_rx[8].umem = umem2
> > > >    5. num_tx_queues is changed to 10
> > > >   Now dev->_rx[8].umem != dev->_tx[8].umem, is this possible and is it
> > > > a problem?  
> > >
> > > Plus IIRC the check of qid vs real_num_[rt]x_queues in xsk_bind() is
> > > not under rtnl_lock so it doesn't count for much.  Why not do all the
> > > checks against num_[rt]x_queues here, instead of real_..?  
> >
> > You are correct, two separate rtnl_lock regions is broken. Will spin a
> > v2 tomorrow when I am back in the office.
> >
> > Thanks Jakub for catching this. I really appreciate you reviewing my code.  
> 
> Sorry, forgot to answer your question about why real_num_ instead of
> num_. If we used num_ instead, we would get a behavior where we can
> open a socket on a queue id, let us say 10, that will not be active if
> real_num_ is below 10. So no traffic will flow on this socket until we
> issue an ethtool command to state that we have 10 or more queues
> configured. While this behavior is sane (and consistent), I believe it
> will lead to a more complicated driver implementation, break the
> current uapi (since we will return an error if you try to bind to a
> queue id that is not active at the moment) and I like my suggestion
> below better :-).
> 
> What I would like to suggest is that the current model at bind() is
> kept, that is it will fail if you try to bind to a non-active queue
> id. But we add some code in ethool_set_channels in net/core/ethtool.c
> to check if you have an active af_xdp socket on a queue id and try to
> make it inactive, we return an error and disallow the change. This
> would IMHO be like not allowing a load module to be unloaded if
> someone is using it or not allowing unmounting a file system if files
> are in use. What do you think? If you think this is the right way to
> go, I can implement this in a follow up patch or in this patch set if
> that makes more sense, let me know. This suggestion would actually
> make it simpler to implement zero-copy support in the driver. Some of
> the complications we are having today is due to the fact that ethtool
> can come in from the side and change things for us when an AF_XDP
> socket is active on a queue id. And implementing zero copy support in
> the driver needs to get simpler IMO.
> 
> Please let me know what you think.

I think I'd like that:
https://patchwork.ozlabs.org/project/netdev/list/?series=57843&state=*
in particular:
https://patchwork.ozlabs.org/patch/949891/
;)

If we fix the resize indeed you could stick to real_* but to be honest,
I'm not entirely clear on what the advantage would be?  The arrays are
sized to num..queues, do you envision a use case where having
xdp_get_umem_from_qid() return a NULL when asked for qid > real_..
would help?  Is there one for xdp_reg_umem_at_qid()

I have no strong preference here, just wondering.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ