Message-ID: <20260106215327.GA1957425@ax162>
Date: Tue, 6 Jan 2026 14:53:27 -0700
From: Nathan Chancellor <nathan@...nel.org>
To: Ihor Solodrai <ihor.solodrai@...ux.dev>
Cc: Alexei Starovoitov <ast@...nel.org>,
	Daniel Borkmann <daniel@...earbox.net>,
	Andrii Nakryiko <andrii@...nel.org>,
	Martin KaFai Lau <martin.lau@...ux.dev>,
	Eduard Zingerman <eddyz87@...il.com>,
	Yonghong Song <yonghong.song@...ux.dev>, bpf@...r.kernel.org,
	linux-kernel@...r.kernel.org, llvm@...ts.linux.dev
Subject: Re: [PATCH bpf-next] scripts/gen-btf.sh: Disable LTO when generating
 initial .o file

On Mon, Jan 05, 2026 at 05:06:49PM -0800, Ihor Solodrai wrote:
> I got curious and did a little experiment. Basically, I ran perf stat
> on this part of gen-btf.sh:
> 
> 	echo "" | ${CC} ${CLANG_FLAGS} ${KBUILD_CFLAGS} -c -x c -o ${btf_data} -
> 	${OBJCOPY} --add-section .BTF=${ELF_FILE}.BTF \
> 		--set-section-flags .BTF=alloc,readonly ${btf_data}
> 	${OBJCOPY} --only-section=.BTF --strip-all ${btf_data}
> 
> Replacing ${CC} command with:
> 
> 	${OBJCOPY} --strip-all "${ELF_FILE}" ${btf_data} 2>/dev/null
> 
> for comparison.
> 
> TL;DR is that using ${CC} is:
>   * about 1.5x faster than GNU objcopy --strip-all .tmp_vmlinux1
>   * about 16x (!) faster than llvm-objcopy --strip-all .tmp_vmlinux1
> 
> With obvious caveats that this is a particular machine (Threadripper
> PRO 3975WX), toolchain etc:
>   * clang version 21.1.7
>   * gcc (GCC) 15.2.1 20251211
> 
> This is bpf-next (a069190b590e) with BPF CI-like kconfig.

Oof, that difference between the GNU and LLVM objcopy implementations...
At the same time, it was only a little over a second for llvm-objcopy.
Maybe that gets worse as more is built into the kernel, to the point
where it is untenable, but maybe it is worth the reduced complexity? That
said, my patch is pretty simple (and a follow-up for KBUILD_CPPFLAGS, if
needed, would be equally simple), your testing demonstrates that there
is some performance improvement, and I cannot imagine there being any
other bugs of this nature in this area going forward. I have no
strong opinion either way; I just need my builds to finish :)
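
For anyone who wants to repeat the comparison without perf, here is a rough
sketch of a timing helper. It is not part of gen-btf.sh: the name mean_ms and
the output path in the usage comments are made up for illustration, and it
assumes GNU date (for %N). It only measures wall-clock time, so it is much
cruder than perf stat.

```shell
#!/bin/sh
# Hypothetical helper (not from gen-btf.sh): run a command N times and
# print the mean wall-clock time per run in whole milliseconds.
# Assumes GNU date, which supports %N (nanoseconds).
mean_ms() {
	runs=$1
	shift
	start=$(date +%s%N)
	i=0
	while [ "$i" -lt "$runs" ]; do
		# Discard the command's output; only its run time matters here.
		"$@" >/dev/null 2>&1
		i=$((i + 1))
	done
	end=$(date +%s%N)
	echo $(( (end - start) / runs / 1000000 ))
}

# Example usage (the output path is an assumption, adjust to your tree):
#   mean_ms 10 objcopy --strip-all .tmp_vmlinux1 /tmp/btf_data.o
#   mean_ms 10 llvm-objcopy --strip-all .tmp_vmlinux1 /tmp/btf_data.o
```

Running both objcopy variants against the same .tmp_vmlinux1 should show
whether the gap holds on a given machine and toolchain.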

Cheers,
Nathan
