lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 22 May 2019 22:54:44 +0200
From:   Björn Töpel <bjorn.topel@...il.com>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>
Cc:     Toke Høiland-Jørgensen <toke@...hat.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Netdev <netdev@...r.kernel.org>,
        Björn Töpel <bjorn.topel@...el.com>,
        "Karlsson, Magnus" <magnus.karlsson@...el.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH bpf-next 1/2] net: xdp: refactor XDP_QUERY_PROG{,_HW} to netdev

On Wed, 22 May 2019 at 20:32, Jakub Kicinski
<jakub.kicinski@...ronome.com> wrote:
>
> On Wed, 22 May 2019 14:53:51 +0200, Björn Töpel wrote:
> > From: Björn Töpel <bjorn.topel@...el.com>
> >
> > All XDP capable drivers need to implement the XDP_QUERY_PROG{,_HW}
> > command of ndo_bpf. The query code is fairly generic. This commit
> > refactors the query code up from the drivers to the netdev level.
> >
> > The struct net_device has gained four new members tracking the XDP
> > program, the corresponding program flags, and which programs
> > (SKB_MODE, DRV_MODE or HW_MODE) that are activated.
> >
> > The xdp_prog member, previously only used for SKB_MODE, is shared with
> > DRV_MODE, since they are mutually exclusive.
> >
> > The program query operations is all done under the rtnl lock. However,
> > the xdp_prog member is __rcu annotated, and used in a lock-less manner
> > for the SKB_MODE. This is because the xdp_prog member is shared from a
> > query program perspective (again, SKB and DRV are mutual exclusive),
> > RCU read and assignments functions are still used when altering
> > xdp_prog, even when only for queries in DRV_MODE. This is for
> > sparse/lockdep correctness.
> >
> > A minor behavioral change was done during this refactorization; When
> > passing a negative file descriptor to a netdev to disable XDP, the
> > same program flag as the running program is required. An example.
> >
> > The eth0 is DRV_DRV capable. Previously, this was OK, but confusing:
> >
> >   # ip link set dev eth0 xdp obj foo.o sec main
> >   # ip link set dev eth0 xdpgeneric off
> >
> > Now, the same commands give:
> >
> >   # ip link set dev eth0 xdp obj foo.o sec main
> >   # ip link set dev eth0 xdpgeneric off
> >   Error: native and generic XDP can't be active at the same time.
>
> I'm not clear why this change is necessary? It is a change in
> behaviour, and if anything returning ENOENT would seem cleaner
> in this case.
>

To me, the existing behavior was non-intuitive. If most people *don't*
agree, I'll remove this change. So, what do people think about this?
:-)

ENOENT does make more sense.

> > Signed-off-by: Björn Töpel <bjorn.topel@...el.com>
> > ---
> >  include/linux/netdevice.h |  13 +++--
> >  net/core/dev.c            | 120 ++++++++++++++++++++------------------
> >  net/core/rtnetlink.c      |  33 ++---------
> >  3 files changed, 76 insertions(+), 90 deletions(-)
> >
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 44b47e9df94a..bdcb20a3946c 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -1940,6 +1940,11 @@ struct net_device {
> >  #endif
> >       struct hlist_node       index_hlist;
> >
> > +     struct bpf_prog         *xdp_prog_hw;
> > +     unsigned int            xdp_flags;
> > +     u32                     xdp_prog_flags;
> > +     u32                     xdp_prog_hw_flags;
>
> Do we really need 3 xdp flags variables?  Offloaded programs
> realistically must have only the HW mode flag set (not sure if
> netdevsim still pretends we can do offload of code after rewrites,
> but if it does it can be changed :)).  Non-offloaded programs need
> flags, but we don't need the "all ORed" flags either, AFAICT.  No?
>

Good point, I'll try to reduce the netdev bloat.

> > +
> >  /*
> >   * Cache lines mostly used on transmit path
> >   */
> > @@ -2045,9 +2050,8 @@ struct net_device {
> >
> >  static inline bool netif_elide_gro(const struct net_device *dev)
> >  {
> > -     if (!(dev->features & NETIF_F_GRO) || dev->xdp_prog)
> > -             return true;
> > -     return false;
> > +     return !(dev->features & NETIF_F_GRO) ||
> > +             dev->xdp_flags & XDP_FLAGS_SKB_MODE;
> >  }
> >
> >  #define      NETDEV_ALIGN            32
> > @@ -3684,8 +3688,7 @@ struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
> >  typedef int (*bpf_op_t)(struct net_device *dev, struct netdev_bpf *bpf);
> >  int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
> >                     int fd, u32 flags);
> > -u32 __dev_xdp_query(struct net_device *dev, bpf_op_t xdp_op,
> > -                 enum bpf_netdev_command cmd);
> > +u32 dev_xdp_query(struct net_device *dev, unsigned int flags);
> >  int xdp_umem_query(struct net_device *dev, u16 queue_id);
> >
> >  int __dev_forward_skb(struct net_device *dev, struct sk_buff *skb);
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index b6b8505cfb3e..ead16c3f955c 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -8005,31 +8005,31 @@ int dev_change_proto_down_generic(struct net_device *dev, bool proto_down)
> >  }
> >  EXPORT_SYMBOL(dev_change_proto_down_generic);
> >
> > -u32 __dev_xdp_query(struct net_device *dev, bpf_op_t bpf_op,
> > -                 enum bpf_netdev_command cmd)
> > +u32 dev_xdp_query(struct net_device *dev, unsigned int flags)
>
> You only pass the mode here, so perhaps rename the flags argument to
> mode?
>
> >  {
> > -     struct netdev_bpf xdp;
> > +     struct bpf_prog *prog = NULL;
> >
> > -     if (!bpf_op)
> > +     if (hweight32(flags) != 1)
> >               return 0;
>
> This looks superfluous, given callers will always pass mode, right?
>
> > -     memset(&xdp, 0, sizeof(xdp));
> > -     xdp.command = cmd;
> > -
> > -     /* Query must always succeed. */
> > -     WARN_ON(bpf_op(dev, &xdp) < 0 && cmd == XDP_QUERY_PROG);
> > +     if (flags & (XDP_FLAGS_SKB_MODE | XDP_FLAGS_DRV_MODE))
> > +             prog = rtnl_dereference(dev->xdp_prog);
> > +     else if (flags & XDP_FLAGS_HW_MODE)
> > +             prog = dev->xdp_prog_hw;
>
> Perhaps use a switch statement here?
>
> > -     return xdp.prog_id;
> > +     return prog ? prog->aux->id : 0;
> >  }
> >
> > -static int dev_xdp_install(struct net_device *dev, bpf_op_t bpf_op,
> > +static int dev_xdp_install(struct net_device *dev, unsigned int target,
>
> Could you say mode instead of target everywhere?
>
> >                          struct netlink_ext_ack *extack, u32 flags,
> >                          struct bpf_prog *prog)
> >  {
> > +     bpf_op_t bpf_op = dev->netdev_ops->ndo_bpf;
>
> The point of passing bpf_op around is to have the right one (generic or
> driver) this is lost with the ternary statement below :S
>
> >       struct netdev_bpf xdp;
> > +     int err;
> >
> >       memset(&xdp, 0, sizeof(xdp));
> > -     if (flags & XDP_FLAGS_HW_MODE)
> > +     if (target == XDP_FLAGS_HW_MODE)
>
> Is this change necessary?
>
> >               xdp.command = XDP_SETUP_PROG_HW;
> >       else
> >               xdp.command = XDP_SETUP_PROG;
> > @@ -8037,35 +8037,41 @@ static int dev_xdp_install(struct net_device *dev, bpf_op_t bpf_op,
> >       xdp.flags = flags;
> >       xdp.prog = prog;
> >
> > -     return bpf_op(dev, &xdp);
> > +     err = (target == XDP_FLAGS_SKB_MODE) ?
>
> Brackets unnecessary.
>
> > +           generic_xdp_install(dev, &xdp) :
> > +           bpf_op(dev, &xdp);
> > +     if (!err) {
>
> Keep success path unindented, save indentation.
>
>         if (err)
>                 return err;
>
>         bla bla
>
>         return 0;
>
> > +             if (prog)
> > +                     dev->xdp_flags |= target;
> > +             else
> > +                     dev->xdp_flags &= ~target;
>
> These "all ORed" flags are not needed, AFAICT, as mentioned above.
>
> > +             if (target == XDP_FLAGS_HW_MODE) {
> > +                     dev->xdp_prog_hw = prog;
> > +                     dev->xdp_prog_hw_flags = flags;
> > +             } else if (target == XDP_FLAGS_DRV_MODE) {
> > +                     rcu_assign_pointer(dev->xdp_prog, prog);
> > +                     dev->xdp_prog_flags = flags;
> > +             }
> > +     }
> > +
> > +     return err;
> >  }
> >
> >  static void dev_xdp_uninstall(struct net_device *dev)
> >  {
> > -     struct netdev_bpf xdp;
> > -     bpf_op_t ndo_bpf;
> > -
> > -     /* Remove generic XDP */
> > -     WARN_ON(dev_xdp_install(dev, generic_xdp_install, NULL, 0, NULL));
> > -
> > -     /* Remove from the driver */
> > -     ndo_bpf = dev->netdev_ops->ndo_bpf;
> > -     if (!ndo_bpf)
> > -             return;
> > -
> > -     memset(&xdp, 0, sizeof(xdp));
> > -     xdp.command = XDP_QUERY_PROG;
> > -     WARN_ON(ndo_bpf(dev, &xdp));
> > -     if (xdp.prog_id)
> > -             WARN_ON(dev_xdp_install(dev, ndo_bpf, NULL, xdp.prog_flags,
> > -                                     NULL));
> > -
> > -     /* Remove HW offload */
> > -     memset(&xdp, 0, sizeof(xdp));
> > -     xdp.command = XDP_QUERY_PROG_HW;
> > -     if (!ndo_bpf(dev, &xdp) && xdp.prog_id)
> > -             WARN_ON(dev_xdp_install(dev, ndo_bpf, NULL, xdp.prog_flags,
> > -                                     NULL));
> > +     if (dev->xdp_flags & XDP_FLAGS_SKB_MODE) {
> > +             WARN_ON(dev_xdp_install(dev, XDP_FLAGS_SKB_MODE,
> > +                                     NULL, 0, NULL));
> > +     }
>
> Brackets unnecessary.
>
> > +     if (dev->xdp_flags & XDP_FLAGS_DRV_MODE) {
> > +             WARN_ON(dev_xdp_install(dev, XDP_FLAGS_DRV_MODE,
> > +                                     NULL, dev->xdp_prog_flags, NULL));
> > +     }
>
> You should be able to just call install with the original flags, and
> install handler should do the right maths again to direct it either to
> drv or generic, no?
>
> > +     if (dev->xdp_flags & XDP_FLAGS_HW_MODE) {
> > +             WARN_ON(dev_xdp_install(dev, XDP_FLAGS_HW_MODE,
> > +                                     NULL, dev->xdp_prog_hw_flags, NULL));
> > +     }
> >  }
> >
> >  /**
> > @@ -8080,41 +8086,41 @@ static void dev_xdp_uninstall(struct net_device *dev)
> >  int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
> >                     int fd, u32 flags)
> >  {
> > -     const struct net_device_ops *ops = dev->netdev_ops;
> > -     enum bpf_netdev_command query;
> > +     bool offload, drv_supp = !!dev->netdev_ops->ndo_bpf;
>
> Please avoid calculations in init.  If the function gets complicated
> this just ends up as a weirdly indented code, which is hard to read.
>
> >       struct bpf_prog *prog = NULL;
> > -     bpf_op_t bpf_op, bpf_chk;
> > -     bool offload;
> > +     unsigned int target;
> >       int err;
> >
> >       ASSERT_RTNL();
> >
> >       offload = flags & XDP_FLAGS_HW_MODE;
> > -     query = offload ? XDP_QUERY_PROG_HW : XDP_QUERY_PROG;
> > +     target = offload ? XDP_FLAGS_HW_MODE : XDP_FLAGS_DRV_MODE;
> >
> > -     bpf_op = bpf_chk = ops->ndo_bpf;
> > -     if (!bpf_op && (flags & (XDP_FLAGS_DRV_MODE | XDP_FLAGS_HW_MODE))) {
> > +     if (!drv_supp && (flags & (XDP_FLAGS_DRV_MODE | XDP_FLAGS_HW_MODE))) {
>
> Here you have a bracket for & inside an &&..
>
> >               NL_SET_ERR_MSG(extack, "underlying driver does not support XDP in native mode");
> >               return -EOPNOTSUPP;
> >       }
> > -     if (!bpf_op || (flags & XDP_FLAGS_SKB_MODE))
> > -             bpf_op = generic_xdp_install;
> > -     if (bpf_op == bpf_chk)
> > -             bpf_chk = generic_xdp_install;
> > +
> > +     if (!drv_supp || (flags & XDP_FLAGS_SKB_MODE))
> > +             target = XDP_FLAGS_SKB_MODE;
> > +
> > +     if ((mode == XDP_FLAGS_SKB_MODE &&
> > +          dev->xdp_flags & XDP_FLAGS_DRV_MODE) ||
>
> .. and here you don't :)
>
> > +         (target == XDP_FLAGS_DRV_MODE &&
> > +          dev->xdp_flags & XDP_FLAGS_SKB_MODE)) {
>
> I think this condition can be shortened.  You can't get a conflict if
> the driver has no support.  So the only conflicting case is:
>
>         rcu_access_pointer(dev->xdp_prog) && drv_supp &&
>         (flags ^ dev->xdp_flags) & XDP_FLAGS_SKB_MODE
>
> > +             NL_SET_ERR_MSG(extack, "native and generic XDP can't be active at the same time");
> > +             return -EEXIST;
> > +     }
> >
> >       if (fd >= 0) {
> > -             if (!offload && __dev_xdp_query(dev, bpf_chk, XDP_QUERY_PROG)) {
> > -                     NL_SET_ERR_MSG(extack, "native and generic XDP can't be active at the same time");
> > -                     return -EEXIST;
> > -             }
> > -             if ((flags & XDP_FLAGS_UPDATE_IF_NOEXIST) &&
> > -                 __dev_xdp_query(dev, bpf_op, query)) {
> > +             if (flags & XDP_FLAGS_UPDATE_IF_NOEXIST &&
> > +                 dev_xdp_query(dev, target)) {
> >                       NL_SET_ERR_MSG(extack, "XDP program already attached");
> >                       return -EBUSY;
> >               }
> >
> > -             prog = bpf_prog_get_type_dev(fd, BPF_PROG_TYPE_XDP,
> > -                                          bpf_op == ops->ndo_bpf);
> > +             prog = bpf_prog_get_type_dev(fd, BPF_PROG_TYPE_XDP, drv_supp);
> > +
>
> Extra new line.
>
> >               if (IS_ERR(prog))
> >                       return PTR_ERR(prog);
> >
> > @@ -8125,7 +8131,7 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
> >               }
> >       }
> >
> > -     err = dev_xdp_install(dev, bpf_op, extack, flags, prog);
> > +     err = dev_xdp_install(dev, target, extack, flags, prog);
> >       if (err < 0 && prog)
> >               bpf_prog_put(prog);
> >
>
> I think apart from returning the new error it looks functionally
> correct :)

Thanks for the review, Jakub. Very valuable. I'll address all your
comments in the v3!


Cheers,
Björn

Powered by blists - more mailing lists