Date:   Mon,  6 Dec 2021 16:05:11 +0800
From:   xiangxia.m.yue@...il.com
To:     netdev@...r.kernel.org
Cc:     Tonghao Zhang <xiangxia.m.yue@...il.com>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Cong Wang <xiyou.wangcong@...il.com>,
        Jiri Pirko <jiri@...nulli.us>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Jonathan Lemon <jonathan.lemon@...il.com>,
        Eric Dumazet <edumazet@...gle.com>,
        Alexander Lobakin <alobakin@...me>,
        Paolo Abeni <pabeni@...hat.com>,
        Talal Ahmad <talalahmad@...gle.com>,
        Kevin Hao <haokexin@...il.com>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>,
        Kees Cook <keescook@...omium.org>,
        Kumar Kartikeya Dwivedi <memxor@...il.com>,
        Antoine Tenart <atenart@...nel.org>,
        Wei Wang <weiwan@...gle.com>, Arnd Bergmann <arnd@...db.de>
Subject: [net-next v1 1/2] net: sched: use queue_mapping to pick tx queue

From: Tonghao Zhang <xiangxia.m.yue@...il.com>

This patch fixes the following issue:
* If we install tc filters with act_skbedit in the clsact hook, the
  queue_mapping action has no effect, because netdev_core_pick_tx()
  overwrites queue_mapping.

  $ tc filter add dev $NETDEV egress .. action skbedit queue_mapping 1

This patch is also useful in the following case:
* In a container networking environment, one kind of pod/container/
  net-namespace (e.g. P1, P2) whose outbound traffic is limited can
  use one specific tx queue with an HTB/TBF Qdisc, while another kind
  of pod (e.g. Pn) can use a different tx queue with a FIFO Qdisc.
  The lock contention of the HTB/TBF Qdisc then does not affect Pn.

  +----+      +----+      +----+
  | P1 |      | P2 |      | Pn |
  +----+      +----+      +----+
    |           |           |
    +-----------+-----------+
                |
                | clsact/skbedit
                |    MQ
                v
    +-----------+-----------+
    | q0        | q1        | qn
    v           v           v
   HTB         HTB   ...   FIFO

Cc: Jamal Hadi Salim <jhs@...atatu.com>
Cc: Cong Wang <xiyou.wangcong@...il.com>
Cc: Jiri Pirko <jiri@...nulli.us>
Cc: "David S. Miller" <davem@...emloft.net>
Cc: Jakub Kicinski <kuba@...nel.org>
Cc: Jonathan Lemon <jonathan.lemon@...il.com>
Cc: Eric Dumazet <edumazet@...gle.com>
Cc: Alexander Lobakin <alobakin@...me>
Cc: Paolo Abeni <pabeni@...hat.com>
Cc: Talal Ahmad <talalahmad@...gle.com>
Cc: Kevin Hao <haokexin@...il.com>
Cc: Ilias Apalodimas <ilias.apalodimas@...aro.org>
Cc: Kees Cook <keescook@...omium.org>
Cc: Kumar Kartikeya Dwivedi <memxor@...il.com>
Cc: Antoine Tenart <atenart@...nel.org>
Cc: Wei Wang <weiwan@...gle.com>
Cc: Arnd Bergmann <arnd@...db.de>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@...il.com>
---
 include/linux/skbuff.h  |  1 +
 net/core/dev.c          | 12 +++++++++---
 net/sched/act_skbedit.c |  4 +++-
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index eae4bd3237a4..b6ea4b920409 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -856,6 +856,7 @@ struct sk_buff {
 #endif
 #ifdef CONFIG_NET_CLS_ACT
 	__u8			tc_skip_classify:1;
+	__u8			tc_skip_txqueue:1;
 	__u8			tc_at_ingress:1;
 #endif
 	__u8			redirected:1;
diff --git a/net/core/dev.c b/net/core/dev.c
index aba8acc1238c..fb9d4eee29ee 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3975,10 +3975,16 @@ struct netdev_queue *netdev_core_pick_tx(struct net_device *dev,
 {
 	int queue_index = 0;
 
-#ifdef CONFIG_XPS
-	u32 sender_cpu = skb->sender_cpu - 1;
+#ifdef CONFIG_NET_CLS_ACT
+	if (skb->tc_skip_txqueue) {
+		queue_index = netdev_cap_txqueue(dev,
+						 skb_get_queue_mapping(skb));
+		return netdev_get_tx_queue(dev, queue_index);
+	}
+#endif
 
-	if (sender_cpu >= (u32)NR_CPUS)
+#ifdef CONFIG_XPS
+	if ((skb->sender_cpu - 1) >= (u32)NR_CPUS)
 		skb->sender_cpu = raw_smp_processor_id() + 1;
 #endif
 
diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c
index d30ecbfc8f84..940091a7c7f0 100644
--- a/net/sched/act_skbedit.c
+++ b/net/sched/act_skbedit.c
@@ -58,8 +58,10 @@ static int tcf_skbedit_act(struct sk_buff *skb, const struct tc_action *a,
 		}
 	}
 	if (params->flags & SKBEDIT_F_QUEUE_MAPPING &&
-	    skb->dev->real_num_tx_queues > params->queue_mapping)
+	    skb->dev->real_num_tx_queues > params->queue_mapping) {
+		skb->tc_skip_txqueue = 1;
 		skb_set_queue_mapping(skb, params->queue_mapping);
+	}
 	if (params->flags & SKBEDIT_F_MARK) {
 		skb->mark &= ~params->mask;
 		skb->mark |= params->mark & params->mask;
-- 
2.27.0
