Message-ID: <6205668.MhkbZ0Pkbq@7950hx>
Date: Sat, 17 Jan 2026 19:27:53 +0800
From: Menglong Dong <menglong.dong@...ux.dev>
To: Qiliang Yuan <realwujing@...il.com>
Cc: alexei.starovoitov@...il.com, andrii@...nel.org, ast@...nel.org,
bpf@...r.kernel.org, daniel@...earbox.net, eddyz87@...il.com,
haoluo@...gle.com, john.fastabend@...il.com, jolsa@...nel.org,
kpsingh@...nel.org, linux-kernel@...r.kernel.org, martin.lau@...ux.dev,
memxor@...il.com, realwujing@...il.com, realwujing@...com, sdf@...ichev.me,
song@...nel.org, yonghong.song@...ux.dev, yuanql9@...natelecom.cn
Subject: Re: [PATCH v2] bpf/verifier: implement slab cache for verifier state list

On 2026/1/17 11:26, Qiliang Yuan wrote:
> On Fri, 16 Jan 2026 22:50:36 +0800, Menglong Dong <menglong.dong@...ux.dev> wrote:
> > On 2026/1/16 21:29, Qiliang Yuan wrote:
> > > The BPF verifier's state exploration logic in is_state_visited() frequently
> > > allocates and deallocates 'struct bpf_verifier_state_list' nodes. Currently,
> > > these allocations use generic kzalloc(), which leads to significant memory
> > > management overhead and page faults during high-complexity verification,
> > > especially in multi-core parallel scenarios.
> > >
> > > This patch introduces a dedicated 'bpf_verifier_state_list' slab cache to
> > > optimize these allocations, providing better speed, reduced fragmentation,
> > > and improved cache locality. All allocation and deallocation paths are
> > > migrated to use kmem_cache_zalloc() and kmem_cache_free().
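For reference, a minimal sketch of the change being described; identifiers
such as bpf_state_list_cache are illustrative, not necessarily the patch's
exact names:

	/* Created once at init; sized for the state list node. */
	static struct kmem_cache *bpf_state_list_cache;

	static int __init bpf_verifier_cache_init(void)
	{
		bpf_state_list_cache = kmem_cache_create(
				"bpf_verifier_state_list",
				sizeof(struct bpf_verifier_state_list),
				0, SLAB_PANIC, NULL);
		return 0;
	}
	late_initcall(bpf_verifier_cache_init);

	/* In is_state_visited(), kzalloc()/kfree() then become: */
	new_sl = kmem_cache_zalloc(bpf_state_list_cache, GFP_KERNEL);
	...
	kmem_cache_free(bpf_state_list_cache, sl);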
> > >
> > > Performance evaluation using a stress test (1000 conditional branches)
> > > executed in parallel on 32 CPU cores for 60 seconds shows significant
> > > improvements:
> >
> > This patch is a bit of a mess. First, don't send a new version as a reply
> > to your previous version.
>
> Hi Menglong,
>
> Congratulations on obtaining your @linux.dev email! It is great to see your
> contribution to the community being recognized.
>
> The core logic remains unchanged. Following suggestions from several
> reviewers, I've added the perf benchmark data and sent this as a reply to the
> previous thread to make the context and review history easier to track.
You can put a link to your previous version in the change log. I
suspect patchwork can't even detect this new version if you send
it as a reply.
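For example, a v2 could carry the history below the '---' marker, where
'git am' drops it, with a link to the previous posting (URL elided here):

	Signed-off-by: Qiliang Yuan <...>
	---
	v1 -> v2: added perf benchmark data
	v1: https://lore.kernel.org/bpf/...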
>
> > > Metric | Baseline | Patched | Delta (%)
> > > --------------------|---------------|---------------|----------
> > > Page Faults | 12,377,064 | 8,534,044 | -31.05%
> > > IPC | 1.17 | 1.22 | +4.27%
> > > CPU Cycles | 1,795.37B | 1,700.33B | -5.29%
> > > Instructions | 2,102.99B | 2,074.27B | -1.37%
> >
> > And the test case is odd too. What performance improvement do we
> > get from this test result? You run veristat in a loop and record the
> > performance with perf for 60s, so what does that tell us? Shouldn't you
> > run veristat a fixed number of times and measure the performance, such as
> > the duration or the CPU cycles?
>
> Following suggestions from several reviewers, I aimed to provide perf
> benchmark data for comparison. However, existing veristat tests do not
> frequently trigger the specific state list allocation paths I modified. This
> is why I constructed a dedicated stress test and included the code in the
> commit message to clearly demonstrate the performance gains.
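A hedged sketch of the kind of branch-heavy program such a test might use
(illustrative only; the real test is the one included in the commit
message). Each conditional forks the verifier's path exploration, so
is_state_visited() allocates and frees many state list nodes:

	#include <linux/bpf.h>
	#include <bpf/bpf_helpers.h>

	SEC("socket")
	int branch_stress(struct __sk_buff *skb)
	{
		__u64 acc = 0;

		/* Every test below splits verifier exploration in two. */
		if (skb->len > 1) acc += 1;
		if (skb->len > 2) acc += 2;
		if (skb->len > 3) acc += 3;
		/* ... repeated on the order of 1000 times ... */
		return acc & 1;
	}

	char LICENSE[] SEC("license") = "GPL";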
>
> > You optimize the verifier to reduce the verification time in your case,
> > which seems to be a complex BPF program that consumes much time
> > in the verifier. So what performance increase do you get in your case?
>
> The performance gains are primarily seen in the 31.05% reduction in Page Faults
> and the 4.27% increase in IPC. These metrics indicate that moving to a
> dedicated slab cache significantly reduces memory management overhead and
> improves instruction throughput. Specifically, the reduction in CPU cycles
> (-5.29%) confirms that the verifier spends less time on internal allocation
> logic, which is crucial for complex BPF programs that involve deep state
> exploration.
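Numbers like the above can be gathered with something along the lines of
the following; the exact invocation is an assumption, not quoted from the
patch, and perf derives IPC as instructions per cycle:

	perf stat -e page-faults,cycles,instructions -- ./run_stress.sh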
You introduce the slab cache to speed up the verifier, so I think we need
a comparison, such as how long a complex BPF program takes in
the verifier. If it is no more than 1ms, then I think it doesn't make much
sense to chase a 5% speedup. After all, it's not a BPF runtime
overhead.
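One way to produce that comparison is veristat's compare mode, roughly as
below; treat the flags and stat names as a sketch against current veristat:

	# on the baseline kernel
	veristat -o csv -e file,prog,duration,insns prog.bpf.o > base.csv
	# on the patched kernel
	veristat -o csv -e file,prog,duration,insns prog.bpf.o > patched.csv
	# per-program deltas, including verification duration
	veristat -C base.csv patched.csv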
>
> > > Suggested-by: Kumar Kartikeya Dwivedi <memxor@...il.com>
> > > Suggested-by: Eduard Zingerman <eddyz87@...il.com>
> >
> > You don't need to add all the reviewers here, unless big changes are
> > made.
>
> That makes sense, thanks for the advice. I'll refine this in the next version.
>
> > > Signed-off-by: Qiliang Yuan <realwujing@...il.com>
> > > ---
> > > On Mon, 2026-01-12 at 19:15 +0100, Kumar Kartikeya Dwivedi wrote:
> > > > Did you run any numbers on whether this improves verification performance?
> > > > Without any compelling evidence, I would leave things as-is.
> >
> > This is not how we write change logs; please see how other people
> > do it.
>
> Actually, the content following the 'Signed-off-by' line and the '---' marker
> is specifically designed to be ignored by 'git am' when the patch is applied.
> Only the text above the triple-dash is preserved as the permanent commit
> message. I intentionally placed the responses to previous reviewer comments
> in that section so that you could see the context and history during review
> without those discussions being permanently recorded in the git log. You can
> verify this behavior by testing 'git am' on a similar patch.
>
> It's for this very reason that I decided to include the reply to reviewers
> directly within the v2 patch.
I don't think we do it this way, and it makes the patch look like a mess.
You can reply directly in the mail.
>