Message-ID: <CADvbK_fkcw4XkT4zhb0Db5qH9q_yFMWmRgKMrHQvmVH+CMY=7g@mail.gmail.com>
Date: Fri, 17 Jan 2025 13:32:53 -0500
From: Xin Long <lucien.xin@...il.com>
To: Asbjørn Sloth Tønnesen <ast@...erby.net>
Cc: network dev <netdev@...r.kernel.org>, "David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Jamal Hadi Salim <jhs@...atatu.com>, Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>, Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
Shuang Li <shuali@...hat.com>
Subject: Re: [PATCH v3 net-next] net: sched: refine software bypass handling in tc_run
On Fri, Jan 17, 2025 at 10:08 AM Asbjørn Sloth Tønnesen <ast@...erby.net> wrote:
>
> On 1/15/25 2:27 PM, Xin Long wrote:
> > This patch addresses issues with filter counting in block (tcf_block),
> > particularly for software bypass scenarios, by introducing a more
> > accurate mechanism using useswcnt.
> >
> > [...]
> > The improvement can be demonstrated using the following script:
> >
> > # cat insert_tc_rules.sh
> >
> > tc qdisc add dev ens1f0np0 ingress
> > for i in $(seq 16); do
> > taskset -c $i tc -b rules_$i.txt &
> > done
> > wait
> >
> > Each of rules_$i.txt files above includes 100000 tc filter rules to a
> > mlx5 driver NIC ens1f0np0.
> >
> > Without this patch:
> >
> > # time sh insert_tc_rules.sh
> >
> > real 0m50.780s
> > user 0m23.556s
> > sys 4m13.032s
> >
> > With this patch:
> >
> > # time sh insert_tc_rules.sh
> >
> > real 0m17.718s
> > user 0m7.807s
> > sys 3m45.050s
>
> I assume that you have tested that these numbers are still roughly the same for v3?
Yup, roughly the same.
>
> > [...]
> > DEFINE_STATIC_KEY_FALSE(netstamp_needed_key);
> > @@ -4045,10 +4045,13 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
> > if (!miniq)
> > return ret;
> >
> > - if (static_branch_unlikely(&tcf_bypass_check_needed_key)) {
> > - if (tcf_block_bypass_sw(miniq->block))
> > - return ret;
> > - }
> > + /* Global bypass */
> > + if (!static_branch_likely(&tcf_sw_enabled_key))
> > + return ret;
>
> I have tested with both static_branch_likely() and static_branch_unlikely(),
> but my results are inconclusive. I don't see a significant difference in my
> tests, but it causes a lot of changes in the object code.
>
> $ diff -Naur <(objdump --no-addresses -d dev_likely.o) \
> <(objdump --no-addresses -d dev_unlikely.o) | diffstat
> 62 | 156 ++++++++++++++++++++++++++++++++++----------------------------------
> 1 file changed, 79 insertions(+), 77 deletions(-)
>
> > +
> > + /* Block-wise bypass */
> > + if (tcf_block_bypass_sw(miniq->block))
> > + return ret;
> >
> > tc_skb_cb(skb)->mru = 0;
> > tc_skb_cb(skb)->post_ct = false;
> > [...]
>
> When I run the benchmark tests from my original bypass patch last year,
> I don't see any significant differences in the forwarding performance.
> (Xeon D-1518, single 8-core CPU, no parallel rule updates).
>
> Reviewed-by: Asbjørn Sloth Tønnesen <ast@...erby.net>
> Tested-by: Asbjørn Sloth Tønnesen <ast@...erby.net>
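
For readers following the thread, the two-level check in the hunk above
(a global bypass, then a block-wise bypass) can be modelled in user space
roughly as follows. This is only a sketch under stated assumptions: every
model_* name is hypothetical, a plain integer stands in for the
tcf_sw_enabled_key static key, and the counting rule (each filter without
skip_sw bumps both counters) is a simplification rather than the patch's
exact accounting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; these are not the kernel's structures. */
struct model_block {
    int useswcnt;   /* filters on this block that need the software path */
};

/* Stand-in for the tcf_sw_enabled_key static key: a plain counter here. */
static int model_sw_enabled;

/* Account for a filter being added. Filters flagged skip_sw are assumed
 * to be handled by hardware only, so they touch neither counter. */
void model_filter_add(struct model_block *b, bool skip_sw)
{
    if (skip_sw)
        return;
    b->useswcnt++;
    model_sw_enabled++;
}

/* Undo the accounting done by model_filter_add(). */
void model_filter_del(struct model_block *b, bool skip_sw)
{
    if (skip_sw)
        return;
    b->useswcnt--;
    model_sw_enabled--;
}

/* Returns true when the software classifier would have to run. */
bool model_tc_run_needs_sw(const struct model_block *b)
{
    /* Global bypass: no software filter exists anywhere. */
    if (model_sw_enabled == 0)
        return false;

    /* Block-wise bypass: this particular block has no software filter. */
    if (b != NULL && b->useswcnt == 0)
        return false;

    return true;
}
```

In this model a block that only carries skip_sw filters keeps being
bypassed even after some other block gains a software filter, which is the
case the block-wise check is there to cover.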