Message-ID: <CAADnVQKAT3UPzcpzkJ6_-powz4YTiDAku4-a+++hrhYdJUnLiw@mail.gmail.com>
Date: Mon, 23 Jun 2025 14:32:31 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Arnd Bergmann <arnd@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>
Cc: Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>, Nathan Chancellor <nathan@...nel.org>, Arnd Bergmann <arnd@...db.de>,
John Fastabend <john.fastabend@...il.com>, Martin KaFai Lau <martin.lau@...ux.dev>,
Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>, KP Singh <kpsingh@...nel.org>,
Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
Nick Desaulniers <nick.desaulniers+lkml@...il.com>, Bill Wendling <morbo@...gle.com>,
Justin Stitt <justinstitt@...gle.com>, Kumar Kartikeya Dwivedi <memxor@...il.com>,
Luis Gerhorst <luis.gerhorst@....de>, bpf <bpf@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, clang-built-linux <llvm@...ts.linux.dev>
Subject: Re: [PATCH] bpf: turn off sanitizer in do_misc_fixups for old clang
On Fri, Jun 20, 2025 at 4:38 AM Arnd Bergmann <arnd@...nel.org> wrote:
>
> From: Arnd Bergmann <arnd@...db.de>
>
> clang versions before version 18 manage to badly optimize the bpf
> verifier, with lots of variable spills leading to excessive stack
> usage in addition to likely rather slow code:
>
> kernel/bpf/verifier.c:23936:5: error: stack frame size (2096) exceeds limit (1280) in 'bpf_check' [-Werror,-Wframe-larger-than]
> kernel/bpf/verifier.c:21563:12: error: stack frame size (1984) exceeds limit (1280) in 'do_misc_fixups' [-Werror,-Wframe-larger-than]
>
> Turn off the sanitizer in the two functions that suffer the most from
> this when using one of the affected clang versions.
>
> Signed-off-by: Arnd Bergmann <arnd@...db.de>
> ---
> kernel/bpf/verifier.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 2fa797a6d6a2..7724c7a56d79 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -19810,7 +19810,14 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
> return 0;
> }
>
> -static int do_check(struct bpf_verifier_env *env)
> +#if defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION < 180100
> +/* old clang versions cause excessive stack usage here */
> +#define __workaround_kasan __disable_sanitizer_instrumentation
> +#else
> +#define __workaround_kasan
> +#endif
> +
> +static __workaround_kasan int do_check(struct bpf_verifier_env *env)

This looks too hacky for a workaround.
Let's figure out what's causing such excessive stack usage and fix it.
We did some of this work in
commit 6f606ffd6dd7 ("bpf: Move insn_buf[16] to bpf_verifier_env")
and similar commits.
Looks like it wasn't enough, or more stack usage has crept in since then.

Also make sure you're using the latest bpf-next.
A bunch of code was moved out of do_check(),
so I bet the current bpf-next/master doesn't have a problem
with this particular function.
In my kasan build do_check() is now fully inlined.
do_check_common() is not, and it's using 512 bytes of stack.

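For readers unfamiliar with the kind of change the commit subject above describes, here is a minimal userspace sketch of moving a scratch array out of a function's frame and into a long-lived context struct. The types `struct insn` and `struct env`, the function names, and the opcode values are hypothetical placeholders, not the kernel's definitions, and this is not the actual content of that commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct bpf_insn: 8 bytes per instruction. */
struct insn {
	uint8_t code;
	uint8_t regs;
	int16_t off;
	int32_t imm;
};

#define SCRATCH_INSNS 16

/* Hypothetical stand-in for struct bpf_verifier_env. */
struct env {
	struct insn insn_buf[SCRATCH_INSNS];	/* scratch space owned by the env */
};

/* Before: nominally 16 * 8 = 128 bytes of scratch space in this frame. */
static int emit_on_stack(struct insn *out)
{
	struct insn insn_buf[SCRATCH_INSNS];
	int i, cnt = 0;

	insn_buf[cnt++] = (struct insn){ .code = 0xb7, .imm = 0 };	/* placeholder */
	insn_buf[cnt++] = (struct insn){ .code = 0x95 };		/* placeholder */
	assert(cnt <= SCRATCH_INSNS);
	for (i = 0; i < cnt; i++)
		out[i] = insn_buf[i];
	return cnt;
}

/* After: same logic, but the scratch space is borrowed from env. */
static int emit_in_env(struct env *env, struct insn *out)
{
	struct insn *insn_buf = env->insn_buf;
	int i, cnt = 0;

	insn_buf[cnt++] = (struct insn){ .code = 0xb7, .imm = 0 };	/* placeholder */
	insn_buf[cnt++] = (struct insn){ .code = 0x95 };		/* placeholder */
	assert(cnt <= SCRATCH_INSNS);
	for (i = 0; i < cnt; i++)
		out[i] = insn_buf[i];
	return cnt;
}
```

Both variants produce the same output; only the second keeps the 16-entry array out of the frame. How much stack this saves in a real build depends on the compiler and on sanitizer instrumentation.
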
> {
> bool pop_log = !(env->log.level & BPF_LOG_LEVEL2);
> struct bpf_verifier_state *state = env->cur_state;
> @@ -21817,7 +21824,7 @@ static int add_hidden_subprog(struct bpf_verifier_env *env, struct bpf_insn *pat
> /* Do various post-verification rewrites in a single program pass.
> * These rewrites simplify JIT and interpreter implementations.
> */
> -static int do_misc_fixups(struct bpf_verifier_env *env)
> +static __workaround_kasan int do_misc_fixups(struct bpf_verifier_env *env)

This one is using 832 bytes of stack with kasan.
Which is indeed high.
A big chunk seems to be coming from chk_and_sdiv[] and chk_and_smod[].

Yonghong,
looks like you contributed that piece of code.
Please see how to reduce stack size here.
Daniel used this pattern in earlier commits. Looks like
we took it too far.

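To illustrate the pattern being discussed, here is a rough userspace sketch, under the assumption that the arrays in question are per-call local instruction arrays that get copied into a patch buffer. The names `patch_via_local` and `patch_direct`, the `struct insn` type, and the opcode values are all hypothetical. Writing directly into a shared buffer is only one possible way to shrink the frame and is not necessarily the fix that was chosen.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct bpf_insn: 8 bytes, no padding. */
struct insn {
	uint8_t code;
	uint8_t regs;
	int16_t off;
	int32_t imm;
};

/*
 * Pattern under discussion: build a local array, then copy it out.
 * Each such array occupies stack space, and a sanitizer may add
 * redzones around it.
 */
static int patch_via_local(struct insn *buf, uint8_t reg)
{
	struct insn chk[] = {
		{ .code = 0x15, .regs = reg, .off = 1 },	/* placeholder */
		{ .code = 0x3f, .regs = reg },			/* placeholder */
		{ .code = 0xb7, .regs = reg, .imm = 0 },	/* placeholder */
	};

	memcpy(buf, chk, sizeof(chk));
	return (int)(sizeof(chk) / sizeof(chk[0]));
}

/* Possible alternative: emit straight into the caller's buffer. */
static int patch_direct(struct insn *buf, uint8_t reg)
{
	struct insn *p = buf;

	*p++ = (struct insn){ .code = 0x15, .regs = reg, .off = 1 };
	*p++ = (struct insn){ .code = 0x3f, .regs = reg };
	*p++ = (struct insn){ .code = 0xb7, .regs = reg, .imm = 0 };
	return (int)(p - buf);
}
```

The two helpers fill the buffer with identical contents; the second avoids a named local array per pattern. An optimizing compiler may still use temporaries, so the real effect has to be measured with the affected compiler and config.
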