lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Thu, 4 Feb 2021 16:42:29 -0800
From:   Wei Wang <weiwan@...gle.com>
To:     Alexander Duyck <alexander.duyck@...il.com>
Cc:     David Miller <davem@...emloft.net>,
        Netdev <netdev@...r.kernel.org>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Hannes Frederic Sowa <hannes@...essinduktion.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Felix Fietkau <nbd@....name>,
        Alexander Duyck <alexanderduyck@...com>
Subject: Re: [PATCH net-next v10 3/3] net: add sysfs attribute to control napi
 threaded mode

On Thu, Feb 4, 2021 at 2:17 PM Alexander Duyck
<alexander.duyck@...il.com> wrote:
>
> On Thu, Feb 4, 2021 at 1:35 PM Wei Wang <weiwan@...gle.com> wrote:
> >
> > This patch adds a new sysfs attribute to the network device class.
> > Said attribute provides a per-device control to enable/disable the
> > threaded mode for all the napi instances of the given network device,
> > without the need for a device up/down.
> > User sets it to 1 or 0 to enable or disable threaded mode.
> > Note: when switching between threaded and the current softirq based mode
> > for a napi instance, it will not immediately take effect if the napi is
> > currently being polled. The mode switch will happen for the next time
> > napi_schedule() is called.
> >
> > Co-developed-by: Paolo Abeni <pabeni@...hat.com>
> > Signed-off-by: Paolo Abeni <pabeni@...hat.com>
> > Co-developed-by: Hannes Frederic Sowa <hannes@...essinduktion.org>
> > Signed-off-by: Hannes Frederic Sowa <hannes@...essinduktion.org>
> > Co-developed-by: Felix Fietkau <nbd@....name>
> > Signed-off-by: Felix Fietkau <nbd@....name>
> > Signed-off-by: Wei Wang <weiwan@...gle.com>
> > ---
> >  Documentation/ABI/testing/sysfs-class-net | 15 +++++
> >  include/linux/netdevice.h                 |  2 +
> >  net/core/dev.c                            | 67 ++++++++++++++++++++++-
> >  net/core/net-sysfs.c                      | 45 +++++++++++++++
> >  4 files changed, 127 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-class-net b/Documentation/ABI/testing/sysfs-class-net
> > index 1f2002df5ba2..1419103d11f9 100644
> > --- a/Documentation/ABI/testing/sysfs-class-net
> > +++ b/Documentation/ABI/testing/sysfs-class-net
> > @@ -337,3 +337,18 @@ Contact:   netdev@...r.kernel.org
> >  Description:
> >                 32-bit unsigned integer counting the number of times the link has
> >                 been down
> > +
> > +What:          /sys/class/net/<iface>/threaded
> > +Date:          Jan 2021
> > +KernelVersion: 5.12
> > +Contact:       netdev@...r.kernel.org
> > +Description:
> > +               Boolean value to control the threaded mode per device. User could
> > +               set this value to enable/disable threaded mode for all napi
> > +               belonging to this device, without the need to do device up/down.
> > +
> > +               Possible values:
> > +               == ==================================
> > +               0  threaded mode disabled for this dev
> > +               1  threaded mode enabled for this dev
> > +               == ==================================
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 99fb4ec9573e..1340327f7abf 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -497,6 +497,8 @@ static inline bool napi_complete(struct napi_struct *n)
> >         return napi_complete_done(n, 0);
> >  }
> >
> > +int dev_set_threaded(struct net_device *dev, bool threaded);
> > +
> >  /**
> >   *     napi_disable - prevent NAPI from scheduling
> >   *     @n: NAPI context
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index a8c5eca17074..9cc9b245419e 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -4290,8 +4290,9 @@ static inline void ____napi_schedule(struct softnet_data *sd,
> >
> >         if (test_bit(NAPI_STATE_THREADED, &napi->state)) {
> >                 /* Paired with smp_mb__before_atomic() in
> > -                * napi_enable(). Use READ_ONCE() to guarantee
> > -                * a complete read on napi->thread. Only call
> > +                * napi_enable()/napi_set_threaded().
> > +                * Use READ_ONCE() to guarantee a complete
> > +                * read on napi->thread. Only call
> >                  * wake_up_process() when it's not NULL.
> >                  */
> >                 thread = READ_ONCE(napi->thread);
> > @@ -6743,6 +6744,68 @@ static void init_gro_hash(struct napi_struct *napi)
> >         napi->gro_bitmask = 0;
> >  }
> >
> > +/* Setting/unsetting threaded mode on a napi might not immediately
> > + * take effect, if the current napi instance is actively being
> > + * polled. In this case, the switch between threaded mode and
> > + * softirq mode will happen in the next round of napi_schedule().
> > + * This should not cause hiccups/stalls to the live traffic.
> > + */
> > +static int napi_set_threaded(struct napi_struct *n, bool threaded)
> > +{
> > +       int err = 0;
> > +
> > +       if (threaded == !!test_bit(NAPI_STATE_THREADED, &n->state))
> > +               return 0;
> > +
> > +       if (!threaded) {
> > +               clear_bit(NAPI_STATE_THREADED, &n->state);
> > +               return 0;
> > +       }
>
>
> > +
> > +       if (!n->thread) {
> > +               err = napi_kthread_create(n);
> > +               if (err)
> > +                       return err;
> > +       }
>
> This piece needs to be broken out similar to what we did for the
> napi_add and napi enable. In the case where we are enabling the
> threaded NAPI you should first go through and allocate all the
> threads. Then once all the threads are allocated you then enable them
> by setting the NAPI_STATE_THREADED bit.
>
> I would pull this section out and place it in a loop in
> dev_set_threaded to handle creating the threads before you set
> dev->threaded and then set the threaded flags in the napi instances.
>
> > +
> > +       /* Make sure kthread is created before THREADED bit
> > +        * is set.
> > +        */
> > +       smp_mb__before_atomic();
>
> If we split off the allocation like I mentioned earlier we can
> probably pull out the barrier and place it after the allocation of the
> kthreads.
>
> > +       set_bit(NAPI_STATE_THREADED, &n->state);
> > +
> > +       return 0;
> > +}
> > +
>
> With the change I suggested above you could also drop the return type
> from this because it should always succeed.
>
> > +static void dev_disable_threaded_all(struct net_device *dev)
> > +{
> > +       struct napi_struct *napi;
> > +
> > +       list_for_each_entry(napi, &dev->napi_list, dev_list)
> > +               napi_set_threaded(napi, false);
> > +       dev->threaded = 0;
> > +}
> > +
>
> You might consider renaming this to dev_set_threaded_all and passing a
> boolean "threaded" argument. Then you could reuse this code for both
> enabling and disabling of the threaded setup. Another thing I might
> change would be how we go about toggling the value. Maybe something
> more like:
>
> dev->threaded = 0;
> list_for_each_entry(napi, &dev->napi_list, dev_list)
>     napi_set_threaded(napi, threaded);
> dev->threaded = threaded;
>
> The advantage of doing it this way is that we will be
> enabling/disabling all instances at the same time since we are
> overwriting the dev->threaded first and then taking care of the
> individual flags. If you are enabling then after we have set the bits
> we set dev->threaded which will switch them all on, and when we clear
> dev->threaded it will switch them all off so they should stop checking
> the threaded flags in the napi structs.
>
> > +int dev_set_threaded(struct net_device *dev, bool threaded)
> > +{
> > +       struct napi_struct *napi;
> > +       int ret;
> > +
>
> There are probably a couple changes that need to be made. First I
> would initialize ret to 0 here. More on that below.
>
> Second we need to add a check for if we are actually changing
> anything. If not then we shouldn't be bothering. So something like the
> following at the start should take care of that:
>
> if (!dev->threaded == !threaded)
>     return 0;
>
> I would add a list_for_each_entry loop that does the thread allocation
> I called out above and will break if an error is encountered. That way
> you aren't toggling dev->threaded unless you know all of the instances
> can succeed. Something like:
>
> if (!dev->threaded) {
>     list_for_each_entry(napi, &dev->napi_list, dev_list) {
>         ret =  napi_kthread_create(napi);
>         if (ret)
>             break;
>     }
> }
>
> /* Make sure kthread is created before THREADED bit is set. */
> smp_mb__before_atomic();
>
> > +       dev->threaded = threaded;
> > +       list_for_each_entry(napi, &dev->napi_list, dev_list) {
> > +               ret = napi_set_threaded(napi, threaded);
> > +               if (ret) {
> > +                       /* Error occurred on one of the napi,
> > +                        * reset threaded mode on all napi.
> > +                        */
> > +                       dev_disable_threaded_all(dev);
> > +                       break;
> > +               }
> > +       }
>
> With my suggestion above you could look at just making the
> dev_disable_threaded_all accept a boolean argument for enable/disable
> and then you could just do something like:
>
> dev_set_threaded_all(struct net_device *dev, !ret && threaded);
>

Ack for the above suggestions. I think I used a combined function
napi_set_threaded() to create kthread + set threaded bit here because
I have the knowledge of all the napi instances belonging to dev here.
And if anyone fails, I would just disable them all afterwards. But I
think that might introduce an inconsistent stage where some napi are
in threaded mode, and some are in softirq mode.
I agree it is cleaner to first create kthreads for all napi, then set
the threaded bit for all.

> > +
> > +       return ret;
> > +}
> > +
> >  void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
> >                     int (*poll)(struct napi_struct *, int), int weight)
> >  {
> > diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> > index daf502c13d6d..969743567257 100644
> > --- a/net/core/net-sysfs.c
> > +++ b/net/core/net-sysfs.c
> > @@ -538,6 +538,50 @@ static ssize_t phys_switch_id_show(struct device *dev,
> >  }
> >  static DEVICE_ATTR_RO(phys_switch_id);
> >
> > +static ssize_t threaded_show(struct device *dev,
> > +                            struct device_attribute *attr, char *buf)
> > +{
> > +       struct net_device *netdev = to_net_dev(dev);
> > +       int ret;
> > +
> > +       if (!rtnl_trylock())
> > +               return restart_syscall();
> > +
> > +       if (!dev_isalive(netdev)) {
> > +               ret = -EINVAL;
> > +               goto unlock;
> > +       }
> > +
> > +       ret = sprintf(buf, fmt_dec, netdev->threaded);
> > +
>
> It seems like the more standard pattern is to initialize ret to
> -EVINAL and just call the sprintf if dev_isalive is true. That would
> allow you to drop the unlock jump label you had to add below.
>
Ack.

> > +unlock:
> > +       rtnl_unlock();
> > +       return ret;
> > +}
> > +
> > +static int modify_napi_threaded(struct net_device *dev, unsigned long val)
> > +{
> > +       int ret;
> > +
> > +       if (list_empty(&dev->napi_list))
> > +               return -EOPNOTSUPP;
> > +
> > +       if (val != 0 && val != 1)
> > +               return -EOPNOTSUPP;
> > +
> > +       ret = dev_set_threaded(dev, val);
> > +
> > +       return ret;
> > +}
> > +
> > +static ssize_t threaded_store(struct device *dev,
> > +                             struct device_attribute *attr,
> > +                             const char *buf, size_t len)
> > +{
> > +       return netdev_store(dev, attr, buf, len, modify_napi_threaded);
> > +}
> > +static DEVICE_ATTR_RW(threaded);
> > +
> >  static struct attribute *net_class_attrs[] __ro_after_init = {
> >         &dev_attr_netdev_group.attr,
> >         &dev_attr_type.attr,
> > @@ -570,6 +614,7 @@ static struct attribute *net_class_attrs[] __ro_after_init = {
> >         &dev_attr_proto_down.attr,
> >         &dev_attr_carrier_up_count.attr,
> >         &dev_attr_carrier_down_count.attr,
> > +       &dev_attr_threaded.attr,
> >         NULL,
> >  };
> >  ATTRIBUTE_GROUPS(net_class);
> > --
> > 2.30.0.365.g02bc693789-goog
> >

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ