lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF=yD-KHFNm=nTJ9Ey2Hv-YLzSBXEnJZw+ruFU0fKBX9K=v0VQ@mail.gmail.com>
Date:   Fri, 3 Nov 2017 12:17:07 +0900
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Björn Töpel <bjorn.topel@...il.com>
Cc:     "Karlsson, Magnus" <magnus.karlsson@...el.com>,
        Alexander Duyck <alexander.h.duyck@...el.com>,
        Alexander Duyck <alexander.duyck@...il.com>,
        John Fastabend <john.fastabend@...il.com>,
        Alexei Starovoitov <ast@...com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        michael.lundkvist@...csson.com, ravineet.singh@...csson.com,
        Daniel Borkmann <daniel@...earbox.net>,
        Network Development <netdev@...r.kernel.org>,
        Björn Töpel <bjorn.topel@...el.com>,
        jesse.brandeburg@...el.com, anjali.singhai@...el.com,
        rami.rosen@...el.com, jeffrey.b.shaw@...el.com,
        ferruh.yigit@...el.com, qi.z.zhang@...el.com
Subject: Re: [RFC PATCH 07/14] packet: wire up zerocopy for AF_PACKET V4

On Tue, Oct 31, 2017 at 9:41 PM, Björn Töpel <bjorn.topel@...il.com> wrote:
> From: Björn Töpel <bjorn.topel@...el.com>
>
> This commits adds support for zerocopy mode. Note that zerocopy mode
> requires that the network interface has been bound to the socket using
> the bind syscall, and that the corresponding netdev implements the
> AF_PACKET V4 ndos.
>
> Signed-off-by: Björn Töpel <bjorn.topel@...el.com>
> ---
> +
> +static void packet_v4_disable_zerocopy(struct net_device *dev,
> +                                      struct tp4_netdev_parms *zc)
> +{
> +       struct tp4_netdev_parms params;
> +
> +       params = *zc;
> +       params.command  = TP4_DISABLE;
> +
> +       (void)dev->netdev_ops->ndo_tp4_zerocopy(dev, &params);

Don't ignore error return codes.

> +static int packet_v4_zerocopy(struct sock *sk, int qp)
> +{
> +       struct packet_sock *po = pkt_sk(sk);
> +       struct socket *sock = sk->sk_socket;
> +       struct tp4_netdev_parms *zc = NULL;
> +       struct net_device *dev;
> +       bool if_up;
> +       int ret = 0;
> +
> +       /* Currently, only RAW sockets are supported.*/
> +       if (sock->type != SOCK_RAW)
> +               return -EINVAL;
> +
> +       rtnl_lock();
> +       dev = packet_cached_dev_get(po);
> +
> +       /* Socket needs to be bound to an interface. */
> +       if (!dev) {
> +               rtnl_unlock();
> +               return -EISCONN;
> +       }
> +
> +       /* The device needs to have both the NDOs implemented. */
> +       if (!(dev->netdev_ops->ndo_tp4_zerocopy &&
> +             dev->netdev_ops->ndo_tp4_xmit)) {
> +               ret = -EOPNOTSUPP;
> +               goto out_unlock;
> +       }

Inconsistent error handling with above test.

> +
> +       if (!(po->rx_ring.pg_vec && po->tx_ring.pg_vec)) {
> +               ret = -EOPNOTSUPP;
> +               goto out_unlock;
> +       }

A ring can be unmapped later with packet_set_ring. Should that operation
fail if zerocopy is enabled? After that, it can also change version with
PACKET_VERSION.

> +
> +       if_up = dev->flags & IFF_UP;
> +       zc = rtnl_dereference(po->zc);
> +
> +       /* Disable */
> +       if (qp <= 0) {
> +               if (!zc)
> +                       goto out_unlock;
> +
> +               packet_v4_disable_zerocopy(dev, zc);
> +               rcu_assign_pointer(po->zc, NULL);
> +
> +               if (if_up) {
> +                       spin_lock(&po->bind_lock);
> +                       register_prot_hook(sk);
> +                       spin_unlock(&po->bind_lock);
> +               }

There have been a bunch of race conditions in this bind code. We need
to be very careful with adding more states to the locking, especially when
open coding in multiple locations, as this patch does. I counted at least
four bind locations. See for instance also
http://patchwork.ozlabs.org/patch/813945/


> +
> +               goto out_unlock;
> +       }
> +
> +       /* Enable */
> +       if (!zc) {
> +               zc = kzalloc(sizeof(*zc), GFP_KERNEL);
> +               if (!zc) {
> +                       ret = -ENOMEM;
> +                       goto out_unlock;
> +               }
> +       }
> +
> +       if (zc->queue_pair >= 0)
> +               packet_v4_disable_zerocopy(dev, zc);

This calls disable even if zc was freshly allocated.
Shoud be > 0?

>  static int packet_release(struct socket *sock)
>  {
> +       struct tp4_netdev_parms *zc;
>         struct sock *sk = sock->sk;
> +       struct net_device *dev;
>         struct packet_sock *po;
>         struct packet_fanout *f;
>         struct net *net;
> @@ -3337,6 +3541,20 @@ static int packet_release(struct socket *sock)
>         sock_prot_inuse_add(net, sk->sk_prot, -1);
>         preempt_enable();
>
> +       rtnl_lock();
> +       zc = rtnl_dereference(po->zc);
> +       dev = packet_cached_dev_get(po);
> +       if (zc && dev)
> +               packet_v4_disable_zerocopy(dev, zc);
> +       if (dev)
> +               dev_put(dev);
> +       rtnl_unlock();
> +
> +       if (zc) {
> +               synchronize_rcu();
> +               kfree(zc);
> +       }

Please use a helper function for anything this complex.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ