netdev - Re: [PATCH v3 net-next] net: sched: refine software bypass handling in tc


Message-ID: <741cd05e-e62a-4d72-b85f-ebc627b1e4d3@fiberby.net>
Date: Fri, 17 Jan 2025 15:07:57 +0000
From: Asbjørn Sloth Tønnesen <ast@...erby.net>
To: Xin Long <lucien.xin@...il.com>, network dev <netdev@...r.kernel.org>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski
 <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>,
 Paolo Abeni <pabeni@...hat.com>, Jamal Hadi Salim <jhs@...atatu.com>,
 Cong Wang <xiyou.wangcong@...il.com>, Jiri Pirko <jiri@...nulli.us>,
 Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
 Shuang Li <shuali@...hat.com>
Subject: Re: [PATCH v3 net-next] net: sched: refine software bypass handling
 in tc_run

On 1/15/25 2:27 PM, Xin Long wrote:
> This patch addresses issues with filter counting in block (tcf_block),
> particularly for software bypass scenarios, by introducing a more
> accurate mechanism using useswcnt.
> 
> [...]
>    The improvement can be demonstrated using the following script:
> 
>    # cat insert_tc_rules.sh
> 
>      tc qdisc add dev ens1f0np0 ingress
>      for i in $(seq 16); do
>          taskset -c $i tc -b rules_$i.txt &
>      done
>      wait
> 
>    Each of rules_$i.txt files above includes 100000 tc filter rules to a
>    mlx5 driver NIC ens1f0np0.
> 
>    Without this patch:
> 
>    # time sh insert_tc_rules.sh
> 
>      real    0m50.780s
>      user    0m23.556s
>      sys     4m13.032s
> 
>    With this patch:
> 
>    # time sh insert_tc_rules.sh
> 
>      real    0m17.718s
>      user    0m7.807s
>      sys     3m45.050s

I assume that you have tested that these numbers are still roughly the same for v3?

> [...]
>   DEFINE_STATIC_KEY_FALSE(netstamp_needed_key);
> @@ -4045,10 +4045,13 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
>   	if (!miniq)
>   		return ret;
>   
> -	if (static_branch_unlikely(&tcf_bypass_check_needed_key)) {
> -		if (tcf_block_bypass_sw(miniq->block))
> -			return ret;
> -	}
> +	/* Global bypass */
> +	if (!static_branch_likely(&tcf_sw_enabled_key))
> +		return ret;

I have tested with both static_branch_likely() and static_branch_unlikely(),
but my results are inconclusive: I don't see a significant difference in my
tests, although the choice causes a lot of changes in the object code.

$ diff -Naur <(objdump --no-addresses -d dev_likely.o) \
              <(objdump --no-addresses -d dev_unlikely.o) | diffstat
  62 |  156 ++++++++++++++++++++++++++++++++++----------------------------------
  1 file changed, 79 insertions(+), 77 deletions(-)

> +
> +	/* Block-wise bypass */
> +	if (tcf_block_bypass_sw(miniq->block))
> +		return ret;
>   
>   	tc_skb_cb(skb)->mru = 0;
>   	tc_skb_cb(skb)->post_ct = false;
> [...]
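
For readers following the hunk above, the order of the two checks can be
modelled in plain userspace C. This is only a rough sketch under stated
assumptions: every sketch_* name is hypothetical, the integer counter merely
stands in for whatever enables tcf_sw_enabled_key (that part of the patch is
not quoted here), and __builtin_expect() (GCC/Clang) is just a branch hint,
whereas a real static key patches the instruction stream at runtime.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tc block; not the kernel's struct tcf_block. */
struct sketch_block {
	bool bypass_sw;	/* models the result of tcf_block_bypass_sw() */
};

enum sketch_verdict {
	SKETCH_PASS_THROUGH,	/* software classification skipped */
	SKETCH_CLASSIFY,	/* software classification would run */
};

/*
 * Assumption: a plain counter of software-path users models the global
 * switch. The kernel uses a static key (tcf_sw_enabled_key) instead.
 */
static int sketch_sw_users;

static void sketch_sw_user_add(void)
{
	sketch_sw_users++;
}

static void sketch_sw_user_del(void)
{
	if (sketch_sw_users > 0)
		sketch_sw_users--;
}

static enum sketch_verdict sketch_tc_run(const struct sketch_block *block)
{
	/* models the !miniq early return */
	if (!block)
		return SKETCH_PASS_THROUGH;

	/* Global bypass: nothing anywhere needs the software path. */
	if (!__builtin_expect(sketch_sw_users > 0, 1))
		return SKETCH_PASS_THROUGH;

	/* Block-wise bypass: this particular block does not need it. */
	if (block->bypass_sw)
		return SKETCH_PASS_THROUGH;

	return SKETCH_CLASSIFY;
}
```

The point of the ordering is that the cheap global test comes first, so the
per-block state is only consulted once at least one software user exists.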

When I run the benchmark tests from my original bypass patch from last year,
I don't see any significant difference in forwarding performance
(Xeon D-1518, single 8-core CPU, no parallel rule updates).

Reviewed-by: Asbjørn Sloth Tønnesen <ast@...erby.net>
Tested-by: Asbjørn Sloth Tønnesen <ast@...erby.net>