[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF=yD-KHFNm=nTJ9Ey2Hv-YLzSBXEnJZw+ruFU0fKBX9K=v0VQ@mail.gmail.com>
Date: Fri, 3 Nov 2017 12:17:07 +0900
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Björn Töpel <bjorn.topel@...il.com>
Cc: "Karlsson, Magnus" <magnus.karlsson@...el.com>,
Alexander Duyck <alexander.h.duyck@...el.com>,
Alexander Duyck <alexander.duyck@...il.com>,
John Fastabend <john.fastabend@...il.com>,
Alexei Starovoitov <ast@...com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
michael.lundkvist@...csson.com, ravineet.singh@...csson.com,
Daniel Borkmann <daniel@...earbox.net>,
Network Development <netdev@...r.kernel.org>,
Björn Töpel <bjorn.topel@...el.com>,
jesse.brandeburg@...el.com, anjali.singhai@...el.com,
rami.rosen@...el.com, jeffrey.b.shaw@...el.com,
ferruh.yigit@...el.com, qi.z.zhang@...el.com
Subject: Re: [RFC PATCH 07/14] packet: wire up zerocopy for AF_PACKET V4
On Tue, Oct 31, 2017 at 9:41 PM, Björn Töpel <bjorn.topel@...il.com> wrote:
> From: Björn Töpel <bjorn.topel@...el.com>
>
> This commits adds support for zerocopy mode. Note that zerocopy mode
> requires that the network interface has been bound to the socket using
> the bind syscall, and that the corresponding netdev implements the
> AF_PACKET V4 ndos.
>
> Signed-off-by: Björn Töpel <bjorn.topel@...el.com>
> ---
> +
> +static void packet_v4_disable_zerocopy(struct net_device *dev,
> + struct tp4_netdev_parms *zc)
> +{
> + struct tp4_netdev_parms params;
> +
> + params = *zc;
> + params.command = TP4_DISABLE;
> +
> + (void)dev->netdev_ops->ndo_tp4_zerocopy(dev, ¶ms);
Don't ignore error return codes.
> +static int packet_v4_zerocopy(struct sock *sk, int qp)
> +{
> + struct packet_sock *po = pkt_sk(sk);
> + struct socket *sock = sk->sk_socket;
> + struct tp4_netdev_parms *zc = NULL;
> + struct net_device *dev;
> + bool if_up;
> + int ret = 0;
> +
> + /* Currently, only RAW sockets are supported.*/
> + if (sock->type != SOCK_RAW)
> + return -EINVAL;
> +
> + rtnl_lock();
> + dev = packet_cached_dev_get(po);
> +
> + /* Socket needs to be bound to an interface. */
> + if (!dev) {
> + rtnl_unlock();
> + return -EISCONN;
> + }
> +
> + /* The device needs to have both the NDOs implemented. */
> + if (!(dev->netdev_ops->ndo_tp4_zerocopy &&
> + dev->netdev_ops->ndo_tp4_xmit)) {
> + ret = -EOPNOTSUPP;
> + goto out_unlock;
> + }
Inconsistent error handling with above test.
> +
> + if (!(po->rx_ring.pg_vec && po->tx_ring.pg_vec)) {
> + ret = -EOPNOTSUPP;
> + goto out_unlock;
> + }
A ring can be unmapped later with packet_set_ring. Should that operation
fail if zerocopy is enabled? After that, it can also change version with
PACKET_VERSION.
> +
> + if_up = dev->flags & IFF_UP;
> + zc = rtnl_dereference(po->zc);
> +
> + /* Disable */
> + if (qp <= 0) {
> + if (!zc)
> + goto out_unlock;
> +
> + packet_v4_disable_zerocopy(dev, zc);
> + rcu_assign_pointer(po->zc, NULL);
> +
> + if (if_up) {
> + spin_lock(&po->bind_lock);
> + register_prot_hook(sk);
> + spin_unlock(&po->bind_lock);
> + }
There have been a bunch of race conditions in this bind code. We need
to be very careful with adding more states to the locking, especially when
open coding in multiple locations, as this patch does. I counted at least
four bind locations. See for instance also
http://patchwork.ozlabs.org/patch/813945/
> +
> + goto out_unlock;
> + }
> +
> + /* Enable */
> + if (!zc) {
> + zc = kzalloc(sizeof(*zc), GFP_KERNEL);
> + if (!zc) {
> + ret = -ENOMEM;
> + goto out_unlock;
> + }
> + }
> +
> + if (zc->queue_pair >= 0)
> + packet_v4_disable_zerocopy(dev, zc);
This calls disable even if zc was freshly allocated.
Shoud be > 0?
> static int packet_release(struct socket *sock)
> {
> + struct tp4_netdev_parms *zc;
> struct sock *sk = sock->sk;
> + struct net_device *dev;
> struct packet_sock *po;
> struct packet_fanout *f;
> struct net *net;
> @@ -3337,6 +3541,20 @@ static int packet_release(struct socket *sock)
> sock_prot_inuse_add(net, sk->sk_prot, -1);
> preempt_enable();
>
> + rtnl_lock();
> + zc = rtnl_dereference(po->zc);
> + dev = packet_cached_dev_get(po);
> + if (zc && dev)
> + packet_v4_disable_zerocopy(dev, zc);
> + if (dev)
> + dev_put(dev);
> + rtnl_unlock();
> +
> + if (zc) {
> + synchronize_rcu();
> + kfree(zc);
> + }
Please use a helper function for anything this complex.
Powered by blists - more mailing lists