Date:   Fri, 15 Jul 2022 11:06:13 +0200
From:   Eric Dumazet <edumazet@...gle.com>
To:     Yonglong Li <liyonglong@...natelecom.cn>
Cc:     netdev <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Alexander Duyck <alexanderduyck@...com>
Subject: Re: [PATCH v3] net: sort queues in xps maps

On Fri, Jul 15, 2022 at 8:04 AM Yonglong Li <liyonglong@...natelecom.cn> wrote:
>
> In the following case, where each tx-queue's xps_cpus is set to the same
> cpu mask, packets in the same TCP stream may be hashed to different tx
> queues, because the order of the queues in each xps map is not the same.
>
> First, set each tx-queue with a different cpu mask:
> $ echo 0 > /sys/class/net/eth0/queues/tx-0/xps_cpus
> $ echo 1 > /sys/class/net/eth0/queues/tx-1/xps_cpus
> $ echo 2 > /sys/class/net/eth0/queues/tx-2/xps_cpus
> $ echo 4 > /sys/class/net/eth0/queues/tx-3/xps_cpus
> and then set each tx-queue with the same cpu mask:
> $ echo f > /sys/class/net/eth0/queues/tx-0/xps_cpus
> $ echo f > /sys/class/net/eth0/queues/tx-1/xps_cpus
> $ echo f > /sys/class/net/eth0/queues/tx-2/xps_cpus
> $ echo f > /sys/class/net/eth0/queues/tx-3/xps_cpus

I am not sure why anybody would do that.

>
> At this point the order of the queues in each map is different, which
> will cause packets in the same stream to be hashed to different tx
> queues:
> attr_map[0].queues = [0,1,2,3]
> attr_map[1].queues = [1,0,2,3]
> attr_map[2].queues = [2,0,1,3]
> attr_map[3].queues = [3,0,1,2]
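
For context on why the order matters: XPS picks the tx queue by indexing
the per-CPU map with the flow hash, so two maps holding the same queues in
a different order select different queues for the same flow. A standalone
sketch of that indexing (hypothetical userspace C; reciprocal_scale() is
reimplemented to match the kernel's, and the selection mirrors
__get_xps_queue_idx()):

#include <stdint.h>
#include <stdio.h>

/* Same scaling the kernel uses to map a 32-bit hash into [0, ep_ro). */
static uint32_t reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
        return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

int main(void)
{
        /* The four differently ordered maps from the report above. */
        uint16_t maps[4][4] = {
                {0, 1, 2, 3},
                {1, 0, 2, 3},
                {2, 0, 1, 3},
                {3, 0, 1, 2},
        };
        uint32_t flow_hash = 0x9e3779b9;        /* stable per flow */

        for (int cpu = 0; cpu < 4; cpu++) {
                int pos = reciprocal_scale(flow_hash, 4);

                printf("cpu %d -> tx queue %u\n", cpu, maps[cpu][pos]);
        }
        return 0;
}

The hash and the map length are identical on every CPU, so pos is the
same everywhere, but the queue stored at that position differs: this
sketch prints queues 2, 2, 1, 1 for CPUs 0-3.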

It is not clear to me what problem you want to solve.

XPS also has logic to make sure that we do not spread packets of the
same flow across multiple TX queues, regardless of the chosen
configuration.

We only care about OOO (out-of-order delivery).

Each skb has skb->ooo_okay for this.

TCP knows when it is okay to select a TX queue different from the one
prior packets used.
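
Roughly how that works (a simplified sketch, not the verbatim kernel
code): __tcp_transmit_skb() sets the flag only when the socket has
nothing left in flight, and netdev_pick_tx() re-steers only when the
flag allows it:

        /* In __tcp_transmit_skb(): safe to move the flow to another
         * queue only when no prior packet of this socket is still
         * queued or in flight.
         */
        skb->ooo_okay = sk_wmem_alloc_get(sk) < SKB_TRUESIZE(1);

        /* In netdev_pick_tx() (simplified): stick to the socket's
         * cached queue unless re-steering cannot reorder the flow.
         */
        int queue_index = sk_tx_queue_get(sk);

        if (queue_index < 0 || skb->ooo_okay ||
            queue_index >= dev->real_num_tx_queues) {
                int new_index = get_xps_queue(dev, sb_dev, skb);

                if (new_index < 0)
                        new_index = skb_tx_hash(dev, sb_dev, skb);
                sk_tx_queue_set(sk, new_index);
                queue_index = new_index;
        }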



>
> It is more reasonable that packets in the same stream be hashed to the
> same tx queue when all tx queues are bound to the same CPUs.

This won't help if you now select:

echo ff > /sys/class/net/eth0/queues/tx-0/xps_cpus
echo fc > /sys/class/net/eth0/queues/tx-1/xps_cpus
echo f0 > /sys/class/net/eth0/queues/tx-2/xps_cpus
echo c0 > /sys/class/net/eth0/queues/tx-3/xps_cpus
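
To spell out the counter-example (assuming an 8-CPU system, CPUs 0-7),
those masks would yield per-CPU maps of:

        CPUs 0-1: queues [0]
        CPUs 2-3: queues [0, 1]
        CPUs 4-5: queues [0, 1, 2]
        CPUs 6-7: queues [0, 1, 2, 3]

Every map is already sorted, yet the maps differ in length, so the same
flow hash can land on a different queue depending on which CPU transmits.
Sorting cannot make overlapping masks consistent.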

So really your patch is not needed IMO.

It also tries to correct user configuration errors.

XPS has not been designed to allow arbitrary settings.

Documentation/networking/scaling.rst has this paragraph:

The goal of this mapping is usually to assign queues
exclusively to a subset of CPUs, where the transmit completions for
these queues are processed on a CPU within this set.
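
For instance, a mapping in that spirit (hypothetical 4-CPU, 4-queue
system, one CPU per queue) would be:

echo 1 > /sys/class/net/eth0/queues/tx-0/xps_cpus
echo 2 > /sys/class/net/eth0/queues/tx-1/xps_cpus
echo 4 > /sys/class/net/eth0/queues/tx-2/xps_cpus
echo 8 > /sys/class/net/eth0/queues/tx-3/xps_cpus

Each queue then has a single-entry map for its CPU, so the ordering
question the patch addresses never arises.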




>
> v1 -> v2:
> Jakub's suggestion: factor out the second loop in __netif_set_xps_queue()
> (starting from the "add tx-queue to CPU/rx-queue maps" comment) into a
> helper
> v2 -> v3:
> keep the skip_tc handling in __netif_set_xps_queue()
>
> Signed-off-by: Yonglong Li <liyonglong@...natelecom.cn>
> ---
>  net/core/dev.c | 45 +++++++++++++++++++++++++++++----------------
>  1 file changed, 29 insertions(+), 16 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 978ed06..f011513 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -150,6 +150,7 @@
>  #include <linux/pm_runtime.h>
>  #include <linux/prandom.h>
>  #include <linux/once_lite.h>
> +#include <linux/sort.h>
>
>  #include "dev.h"
>  #include "net-sysfs.h"
> @@ -199,6 +200,11 @@ static int call_netdevice_notifiers_extack(unsigned long val,
>
>  static DECLARE_RWSEM(devnet_rename_sem);
>
> +static int cmp_u16(const void *a, const void *b)
> +{
> +       return *(u16 *)a - *(u16 *)b;
> +}
> +
>  static inline void dev_base_seq_inc(struct net *net)
>  {
>         while (++net->dev_base_seq == 0)
> @@ -2537,6 +2543,28 @@ static void xps_copy_dev_maps(struct xps_dev_maps *dev_maps,
>         }
>  }
>
> +static void update_xps_map(struct xps_map *map, int cpu, u16 index,
> +                          int *numa_node_id, enum xps_map_type type)
> +{
> +       int pos = 0;
> +
> +       while ((pos < map->len) && (map->queues[pos] != index))
> +               pos++;
> +
> +       if (pos == map->len)
> +               map->queues[map->len++] = index;
> +
> +       sort(map->queues, map->len, sizeof(u16), cmp_u16, NULL);
> +#ifdef CONFIG_NUMA
> +       if (type == XPS_CPUS) {
> +               if (*numa_node_id == -2)
> +                       *numa_node_id = cpu_to_node(cpu);
> +               else if (*numa_node_id != cpu_to_node(cpu))
> +                       *numa_node_id = -1;
> +       }
> +#endif
> +}
> +
>  /* Must be called under cpus_read_lock */
>  int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask,
>                           u16 index, enum xps_map_type type)
> @@ -2629,24 +2657,9 @@ int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask,
>                 if (netif_attr_test_mask(j, mask, nr_ids) &&
>                     netif_attr_test_online(j, online_mask, nr_ids)) {
>                         /* add tx-queue to CPU/rx-queue maps */
> -                       int pos = 0;
> -
>                         skip_tc = true;
> -
>                         map = xmap_dereference(new_dev_maps->attr_map[tci]);
> -                       while ((pos < map->len) && (map->queues[pos] != index))
> -                               pos++;
> -
> -                       if (pos == map->len)
> -                               map->queues[map->len++] = index;
> -#ifdef CONFIG_NUMA
> -                       if (type == XPS_CPUS) {
> -                               if (numa_node_id == -2)
> -                                       numa_node_id = cpu_to_node(j);
> -                               else if (numa_node_id != cpu_to_node(j))
> -                                       numa_node_id = -1;
> -                       }
> -#endif
> +                       update_xps_map(map, j, index, &numa_node_id, type);
>                 }
>
>                 if (copy)
> --
> 1.8.3.1
>
