[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20221227083317.GA1025927@82a1e6c4c19b>
Date: Tue, 27 Dec 2022 08:33:17 +0000
From: Rudi Heitbaum <rudi@...tbaum.com>
To: Pedro Tammela <pctammela@...atatu.com>, netdev@...r.kernel.org
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, jhs@...atatu.com, xiyou.wangcong@...il.com,
jiri@...nulli.us, kuniyu@...zon.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v6 0/4] net/sched: retpoline wrappers for tc
Hi Pedro,
Compiling kernel 6.2-rc1 fails on x86_64 when CONFIG_NET_CLS or
CONFIG_NET_CLS_ACT is not set, when CONFIG_RETPOLINE=y is set.
Does tc_wrapper RETPOLINE need a dependency on NET_CLS/NET_CLS_ACT
to be added? Or a default?
net/sched/sch_api.c: In function 'pktsched_init':
net/sched/sch_api.c:2306:9: error: implicit declaration of function
'tc_wrapper_init' [-Werror=implicit-function-declaration]
2306 | tc_wrapper_init();
| ^~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
make[3]: *** [scripts/Makefile.build:252: net/sched/sch_api.o] Error 1
make[2]: *** [scripts/Makefile.build:504: net/sched] Error 2
make[1]: *** [scripts/Makefile.build:504: net] Error 2
below is the relevent lines from the .config file.
$ grep -e RETPOLINE -e NET_CLS projects/Generic/linux/linux.x86_64.conf
CONFIG_RETPOLINE=y
# CONFIG_NET_CLS_BASIC is not set
# CONFIG_NET_CLS_TCINDEX is not set
# CONFIG_NET_CLS_ROUTE4 is not set
# CONFIG_NET_CLS_FW is not set
# CONFIG_NET_CLS_U32 is not set
# CONFIG_NET_CLS_RSVP is not set
# CONFIG_NET_CLS_RSVP6 is not set
# CONFIG_NET_CLS_FLOW is not set
# CONFIG_NET_CLS_CGROUP is not set
# CONFIG_NET_CLS_BPF is not set
# CONFIG_NET_CLS_FLOWER is not set
# CONFIG_NET_CLS_MATCHALL is not set
# CONFIG_NET_CLS_ACT is not set
On Tue, Dec 06, 2022 at 10:55:09AM -0300, Pedro Tammela wrote:
> In tc all qdics, classifiers and actions can be compiled as modules.
> This results today in indirect calls in all transitions in the tc hierarchy.
> Due to CONFIG_RETPOLINE, CPUs with mitigations=on might pay an extra cost on
> indirect calls. For newer Intel cpus with IBRS the extra cost is
> nonexistent, but AMD Zen cpus and older x86 cpus still go through the
> retpoline thunk.
>
> Known built-in symbols can be optimized into direct calls, thus
> avoiding the retpoline thunk. So far, tc has not been leveraging this
> build information and leaving out a performance optimization for some
> CPUs. In this series we wire up 'tcf_classify()' and 'tcf_action_exec()'
> with direct calls when known modules are compiled as built-in as an
> opt-in optimization.
>
> We measured these changes in one AMD Zen 4 cpu (Retpoline), one AMD Zen 3 cpu (Retpoline),
> one Intel 10th Gen CPU (IBRS), one Intel 3rd Gen cpu (Retpoline) and one
> Intel Xeon CPU (IBRS) using pktgen with 64b udp packets. Our test setup is a
> dummy device with clsact and matchall in a kernel compiled with every
> tc module as built-in. We observed a 3-8% speed up on the retpoline CPUs,
> when going through 1 tc filter, and a 60-100% speed up when going through 100 filters.
> For the IBRS cpus we observed a 1-2% degradation in both scenarios, we believe
> the extra branches check introduced a small overhead therefore we added
> a static key that bypasses the wrapper on kernels not using the retpoline mitigation,
> but compiled with CONFIG_RETPOLINE.
...
Powered by blists - more mailing lists