Message-ID: <C66C764E-C898-457D-93F0-A680983707F0@kernel.org>
Date: Thu, 15 May 2025 07:51:26 -0700
From: Kees Cook <kees@...nel.org>
To: Shung-Hsi Yu <shung-hsi.yu@...e.com>, bpf@...r.kernel.org, linux-mm@...ck.org,
Andrii Nakryiko <andrii@...nel.org>, Ihor Solodrai <ihor.solodrai@...ux.dev>
CC: Andrew Morton <akpm@...ux-foundation.org>, Michal Hocko <mhocko@...e.com>,
Vlastimil Babka <vbabka@...e.cz>, Uladzislau Rezki <urezki@...il.com>,
linux-kernel@...r.kernel.org, linux-hardening@...r.kernel.org,
regressions@...ts.linux.dev, Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
Eduard Zingerman <eddyz87@...il.com>
Subject: Re: [REGRESSION] bpf verifier slowdown due to vrealloc() change since 6.15-rc6
On May 15, 2025 6:12:25 AM PDT, Shung-Hsi Yu <shung-hsi.yu@...e.com> wrote:
>Hi,
>
>There is an observable slowdown when running BPF selftests on 6.15-rc6
>kernel[1] built with tools/testing/selftests/bpf/{config,config.x86_64}.
>Overall, the BPF selftests now take twice as long to run (from ~25m to
>~50m), and verif_scale_loop3_fail went from single-digit seconds to
>6 minutes.
>
>Pawan bisected the slowdown to commit a0309faf1cb0 "mm: vmalloc:
>support more granular vrealloc() sizing"[2]. To narrow down the issue
>further, I tried removing the only kvrealloc() call in kernel/bpf/ by
>reverting commit 96a30e469ca1 "bpf: use common instruction history
>across all states", so _krealloc()_ was used instead of kvrealloc(), and
>observed that there is _no_ slowdown[3]. While the bisect and the revert
>were done on 6.14.7-rc2, I think it should still be pretty representative.
>
>In short, the following were tested:
>- 6.15-rc6 (has a0309faf1cb0) -> slowdown
>- 6.14.7-rc2 (has a0309faf1cb0) -> slowdown
>- 6.14.7-rc2 (has a0309faf1cb0, call to kvrealloc in
> kernel/bpf/verifier.c replaced with krealloc) -> _no_ slowdown
>
>And the vrealloc() change is causing a slowdown in the kvrealloc() call
>within push_insn_history().
This is very strange! The vrealloc change should make things faster -- it avoids a fresh vmalloc and a full object copy when they aren't needed.
Where can I find the .config for the slow runs?
And how do I run the test myself directly?
-Kees
>
> /* for any branch, call, exit record the history of jmps in the given state */
> static int push_insn_history(struct bpf_verifier_env *env, struct bpf_verifier_state *cur,
> 			     int insn_flags, u64 linked_regs)
> {
> 	struct bpf_insn_hist_entry *p;
> 	size_t alloc_size;
> 	...
> 	if (cur->insn_hist_end + 1 > env->insn_hist_cap) {
> 		alloc_size = size_mul(cur->insn_hist_end + 1, sizeof(*p));
> 		p = kvrealloc(env->insn_hist, alloc_size, GFP_USER);
> 		if (!p)
> 			return -ENOMEM;
> 		env->insn_hist = p;
> 		env->insn_hist_cap = alloc_size / sizeof(*p);
> 	}
>
> 	p = &env->insn_hist[cur->insn_hist_end];
> 	p->idx = env->insn_idx;
> 	p->prev_idx = env->prev_insn_idx;
> 	p->flags = insn_flags;
> 	p->linked_regs = linked_regs;
>
> 	cur->insn_hist_end++;
> 	env->cur_hist_ent = p;
>
> 	return 0;
> }
>
>BPF CI probably hasn't hit this yet because bpf-next has only got to
>6.15-rc4.
>
>Shung-Hsi
>
>#regzbot introduced: a0309faf1cb0622cac7c820150b7abf2024acff5
>
>1: https://github.com/shunghsiyu/libbpf/actions/runs/15038992168/job/42266125686
>2: https://lore.kernel.org/stable/20250515041659.smhllyarxdwp7cav@desk/
>3: https://github.com/shunghsiyu/libbpf/actions/runs/15043433548/job/42280277024
--
Kees Cook