Message-ID: <a714ee96d0ad96bbc9d51037616e5c4e2790a8ec.camel@gmail.com>
Date: Mon, 19 Jan 2026 10:43:37 -0800
From: Eduard Zingerman <eddyz87@...il.com>
To: Qiliang Yuan <realwujing@...il.com>
Cc: andrii@...nel.org, ast@...nel.org, bpf@...r.kernel.org,
daniel@...earbox.net, haoluo@...gle.com, jolsa@...nel.org,
kpsingh@...nel.org, linux-kernel@...r.kernel.org, martin.lau@...ux.dev,
sdf@...ichev.me, song@...nel.org, yonghong.song@...ux.dev,
yuanql9@...natelecom.cn
Subject: Re: [PATCH v3] bpf/verifier: optimize precision backtracking by
skipping precise bits
On Sat, 2026-01-17 at 18:09 +0800, Qiliang Yuan wrote:
Hi Qiliang,
> 2. System-wide saturation profiling (32 cores):
> # Start perf in background
> sudo perf stat -a -- sleep 60 &
> # Start 32 parallel loops of veristat
> for i in {1..32}; do (while true; do ./veristat backtrack_stress.bpf.o > /dev/null; done &); done
I'm not sure system-wide testing is helpful in this context.
I'd suggest collecting stats for a single process, e.g. as follows:
perf stat -B --all-kernel -r10 -- ./veristat -q pyperf180.bpf.o
(Note: pyperf180 is a reasonably complex test for many purposes).
And then collecting profiling data:
perf record -o <somewhere-where-mmap-is-possible> \
--all-kernel --call-graph=dwarf --vmlinux=<path-to-vmlinux> \
-- ./veristat -q pyperf180.bpf.o
And then inspecting the profiling data using `perf report`.
What I see in the stats corroborates Yonghong's findings:
W/o the patch:
...
22293282 branch-misses # 2.8 % branch_miss_rate ( +- 1.25% ) (50.10%)
594485451 branches # 1012.5 M/sec branch_frequency ( +- 1.68% ) (66.67%)
1544503960 cpu-cycles # 2.6 GHz cycles_frequency ( +- 0.18% ) (67.02%)
3305212994 instructions # 2.1 instructions insn_per_cycle ( +- 2.04% ) (67.11%)
587496908 stalled-cycles-frontend # 0.38 frontend_cycles_idle ( +- 1.21% ) (66.39%)
0.60033 +- 0.00173 seconds time elapsed ( +- 0.29% )
With the patch:
...
22397789 branch-misses # 2.8 % branch_miss_rate ( +- 1.27% ) (50.37%)
596289399 branches # 1004.8 M/sec branch_frequency ( +- 1.59% ) (66.95%)
1546060617 cpu-cycles # 2.6 GHz cycles_frequency ( +- 0.16% ) (66.67%)
3325745352 instructions # 2.2 instructions insn_per_cycle ( +- 1.76% ) (66.61%)
588040713 stalled-cycles-frontend # 0.38 frontend_cycles_idle ( +- 1.23% ) (66.48%)
0.60697 +- 0.00201 seconds time elapsed ( +- 0.33% )
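(As a sanity check, the insn-per-cycle figures in both runs follow
directly from the raw counters; a minimal awk sketch, using the
instructions and cpu-cycles values quoted above, reproduces them:)

```shell
# Recompute insn_per_cycle = instructions / cpu-cycles from the
# raw counters in the two perf stat runs quoted above.
awk 'BEGIN {
    printf "w/o patch:  %.1f insn/cycle\n", 3305212994 / 1544503960;
    printf "with patch: %.1f insn/cycle\n", 3325745352 / 1546060617;
}'
# Matches perf's reported 2.1 and 2.2 insn_per_cycle, i.e. the
# patch makes no measurable difference on this workload.
```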
So, I'd suggest shelving this change for now.
If you take a look at the profiling data, you'll notice that the
low-hanging fruit is actually improving bpf_patch_insn_data():
it takes ~40% of the time, at least for this program.
This was actually discussed a very long time ago [1].
If you are interested in speeding up the verifier,
maybe consider taking a look?
Best regards,
Eduard Zingerman.
[1] https://lore.kernel.org/bpf/CAEf4BzY_E8MSL4mD0UPuuiDcbJhh9e2xQo2=5w+ppRWWiYSGvQ@mail.gmail.com/