[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJ+HfNhb=wn1XymE4L6QaE--Z_i2s+bDH=wjCBhHMK4gnVw0wg@mail.gmail.com>
Date: Fri, 3 Nov 2017 11:47:24 +0100
From: Björn Töpel <bjorn.topel@...il.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: "Karlsson, Magnus" <magnus.karlsson@...el.com>,
Alexander Duyck <alexander.h.duyck@...el.com>,
Alexander Duyck <alexander.duyck@...il.com>,
John Fastabend <john.fastabend@...il.com>,
Alexei Starovoitov <ast@...com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
michael.lundkvist@...csson.com, ravineet.singh@...csson.com,
Daniel Borkmann <daniel@...earbox.net>,
Network Development <netdev@...r.kernel.org>,
Björn Töpel <bjorn.topel@...el.com>,
jesse.brandeburg@...el.com, anjali.singhai@...el.com,
rami.rosen@...el.com, jeffrey.b.shaw@...el.com,
ferruh.yigit@...el.com, qi.z.zhang@...el.com
Subject: Re: [RFC PATCH 07/14] packet: wire up zerocopy for AF_PACKET V4
2017-11-03 4:17 GMT+01:00 Willem de Bruijn <willemdebruijn.kernel@...il.com>:
> On Tue, Oct 31, 2017 at 9:41 PM, Björn Töpel <bjorn.topel@...il.com> wrote:
>> From: Björn Töpel <bjorn.topel@...el.com>
>>
>> This commits adds support for zerocopy mode. Note that zerocopy mode
>> requires that the network interface has been bound to the socket using
>> the bind syscall, and that the corresponding netdev implements the
>> AF_PACKET V4 ndos.
>>
>> Signed-off-by: Björn Töpel <bjorn.topel@...el.com>
>> ---
>> +
>> +static void packet_v4_disable_zerocopy(struct net_device *dev,
>> + struct tp4_netdev_parms *zc)
>> +{
>> + struct tp4_netdev_parms params;
>> +
>> + params = *zc;
>> + params.command = TP4_DISABLE;
>> +
>> + (void)dev->netdev_ops->ndo_tp4_zerocopy(dev, ¶ms);
>
> Don't ignore error return codes.
>
Will fix!
>> +static int packet_v4_zerocopy(struct sock *sk, int qp)
>> +{
>> + struct packet_sock *po = pkt_sk(sk);
>> + struct socket *sock = sk->sk_socket;
>> + struct tp4_netdev_parms *zc = NULL;
>> + struct net_device *dev;
>> + bool if_up;
>> + int ret = 0;
>> +
>> + /* Currently, only RAW sockets are supported.*/
>> + if (sock->type != SOCK_RAW)
>> + return -EINVAL;
>> +
>> + rtnl_lock();
>> + dev = packet_cached_dev_get(po);
>> +
>> + /* Socket needs to be bound to an interface. */
>> + if (!dev) {
>> + rtnl_unlock();
>> + return -EISCONN;
>> + }
>> +
>> + /* The device needs to have both the NDOs implemented. */
>> + if (!(dev->netdev_ops->ndo_tp4_zerocopy &&
>> + dev->netdev_ops->ndo_tp4_xmit)) {
>> + ret = -EOPNOTSUPP;
>> + goto out_unlock;
>> + }
>
> Inconsistent error handling with above test.
>
Will fix.
>> +
>> + if (!(po->rx_ring.pg_vec && po->tx_ring.pg_vec)) {
>> + ret = -EOPNOTSUPP;
>> + goto out_unlock;
>> + }
>
> A ring can be unmapped later with packet_set_ring. Should that operation
> fail if zerocopy is enabled? After that, it can also change version with
> PACKET_VERSION.
>
You're correct, I've missed this. I need to revisit the scenario when
a ring is unmapped, and recreated. Thanks for pointing this out.
>> +
>> + if_up = dev->flags & IFF_UP;
>> + zc = rtnl_dereference(po->zc);
>> +
>> + /* Disable */
>> + if (qp <= 0) {
>> + if (!zc)
>> + goto out_unlock;
>> +
>> + packet_v4_disable_zerocopy(dev, zc);
>> + rcu_assign_pointer(po->zc, NULL);
>> +
>> + if (if_up) {
>> + spin_lock(&po->bind_lock);
>> + register_prot_hook(sk);
>> + spin_unlock(&po->bind_lock);
>> + }
>
> There have been a bunch of race conditions in this bind code. We need
> to be very careful with adding more states to the locking, especially when
> open coding in multiple locations, as this patch does. I counted at least
> four bind locations. See for instance also
> http://patchwork.ozlabs.org/patch/813945/
>
Yeah, the locking schemes in AF_PACKET is pretty convoluted. I'll
document and make the locking more explicit (and avoiding open coding
it).
>
>> +
>> + goto out_unlock;
>> + }
>> +
>> + /* Enable */
>> + if (!zc) {
>> + zc = kzalloc(sizeof(*zc), GFP_KERNEL);
>> + if (!zc) {
>> + ret = -ENOMEM;
>> + goto out_unlock;
>> + }
>> + }
>> +
>> + if (zc->queue_pair >= 0)
>> + packet_v4_disable_zerocopy(dev, zc);
>
> This calls disable even if zc was freshly allocated.
> Shoud be > 0?
>
Good catch. It should be > 0.
>> static int packet_release(struct socket *sock)
>> {
>> + struct tp4_netdev_parms *zc;
>> struct sock *sk = sock->sk;
>> + struct net_device *dev;
>> struct packet_sock *po;
>> struct packet_fanout *f;
>> struct net *net;
>> @@ -3337,6 +3541,20 @@ static int packet_release(struct socket *sock)
>> sock_prot_inuse_add(net, sk->sk_prot, -1);
>> preempt_enable();
>>
>> + rtnl_lock();
>> + zc = rtnl_dereference(po->zc);
>> + dev = packet_cached_dev_get(po);
>> + if (zc && dev)
>> + packet_v4_disable_zerocopy(dev, zc);
>> + if (dev)
>> + dev_put(dev);
>> + rtnl_unlock();
>> +
>> + if (zc) {
>> + synchronize_rcu();
>> + kfree(zc);
>> + }
>
> Please use a helper function for anything this complex.
Will fix.
Thanks,
Björn
Powered by blists - more mailing lists