lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAK7LNASyfxY9RJU+pvEdyd6yuB=r4C9xcvBBTLokXe_xkhM8RA@mail.gmail.com>
Date: Sun, 27 Oct 2024 18:21:14 +0900
From: Masahiro Yamada <masahiroy@...nel.org>
To: Hari Bathini <hbathini@...ux.ibm.com>
Cc: linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>, bpf@...r.kernel.org, 
	linux-trace-kernel@...r.kernel.org, linux-kbuild@...r.kernel.org, 
	linux-kernel@...r.kernel.org, Michael Ellerman <mpe@...erman.id.au>, 
	"Naveen N. Rao" <naveen@...nel.org>, Mark Rutland <mark.rutland@....com>, 
	Daniel Borkmann <daniel@...earbox.net>, Nicholas Piggin <npiggin@...il.com>, 
	Alexei Starovoitov <ast@...nel.org>, Steven Rostedt <rostedt@...dmis.org>, 
	Masami Hiramatsu <mhiramat@...nel.org>, Andrii Nakryiko <andrii@...nel.org>, 
	Christophe Leroy <christophe.leroy@...roup.eu>, Vishal Chourasia <vishalc@...ux.ibm.com>, 
	Mahesh J Salgaonkar <mahesh@...ux.ibm.com>
Subject: Re: [PATCH v6 12/17] powerpc64/ftrace: Move ftrace sequence out of line

On Sat, Oct 19, 2024 at 2:38 AM Hari Bathini <hbathini@...ux.ibm.com> wrote:
>
> From: Naveen N Rao <naveen@...nel.org>
>
> Function profile sequence on powerpc includes two instructions at the
> beginning of each function:
>         mflr    r0
>         bl      ftrace_caller
>
> The call to ftrace_caller() gets nop'ed out during kernel boot and is
> patched in when ftrace is enabled.
>
> Given the sequence, we cannot return from ftrace_caller with 'blr' as we
> need to keep LR and r0 intact. This results in link stack (return
> address predictor) imbalance when ftrace is enabled. To address that, we
> would like to use a three instruction sequence:
>         mflr    r0
>         bl      ftrace_caller
>         mtlr    r0
>
> Further more, to support DYNAMIC_FTRACE_WITH_CALL_OPS, we need to
> reserve two instruction slots before the function. This results in a
> total of five instruction slots to be reserved for ftrace use on each
> function that is traced.
>
> Move the function profile sequence out-of-line to minimize its impact.
> To do this, we reserve a single nop at function entry using
> -fpatchable-function-entry=1 and add a pass on vmlinux.o to determine
> the total number of functions that can be traced. This is then used to
> generate a .S file reserving the appropriate amount of space for use as
> ftrace stubs, which is built and linked into vmlinux.
>
> On bootup, the stub space is split into separate stubs per function and
> populated with the proper instruction sequence. A pointer to the
> associated stub is maintained in dyn_arch_ftrace.
>
> For modules, space for ftrace stubs is reserved from the generic module
> stub space.
>
> This is restricted to and enabled by default only on 64-bit powerpc,
> though there are some changes to accommodate 32-bit powerpc. This is
> done so that 32-bit powerpc could choose to opt into this based on
> further tests and benchmarks.
>
> As an example, after this patch, kernel functions will have a single nop
> at function entry:
> <kernel_clone>:
>         addis   r2,r12,467
>         addi    r2,r2,-16028
>         nop
>         mfocrf  r11,8
>         ...
>
> When ftrace is enabled, the nop is converted to an unconditional branch
> to the stub associated with that function:
> <kernel_clone>:
>         addis   r2,r12,467
>         addi    r2,r2,-16028
>         b       ftrace_ool_stub_text_end+0x11b28
>         mfocrf  r11,8
>         ...
>
> The associated stub:
> <ftrace_ool_stub_text_end+0x11b28>:
>         mflr    r0
>         bl      ftrace_caller
>         mtlr    r0
>         b       kernel_clone+0xc
>         ...
>
> This change showed an improvement of ~10% in null_syscall benchmark on a
> Power 10 system with ftrace enabled.
>
> Signed-off-by: Naveen N Rao <naveen@...nel.org>
> Signed-off-by: Hari Bathini <hbathini@...ux.ibm.com>
> ---

> diff --git a/arch/powerpc/tools/Makefile b/arch/powerpc/tools/Makefile
> new file mode 100644
> index 000000000000..d2e7ecd5f46f
> --- /dev/null
> +++ b/arch/powerpc/tools/Makefile
> @@ -0,0 +1,9 @@
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +
> +quiet_cmd_gen_ftrace_ool_stubs = GEN     $@
> +      cmd_gen_ftrace_ool_stubs = $< "$(CONFIG_64BIT)" "$(OBJDUMP)" vmlinux.o $@
> +
> +$(obj)/vmlinux.arch.S: $(src)/ftrace-gen-ool-stubs.sh vmlinux.o FORCE
> +       $(call if_changed,gen_ftrace_ool_stubs)
> +
> +targets += vmlinux.arch.S


Makefile looks good to me.


> diff --git a/arch/powerpc/tools/ftrace-gen-ool-stubs.sh b/arch/powerpc/tools/ftrace-gen-ool-stubs.sh
> new file mode 100755
> index 000000000000..96e1ca5803e4
> --- /dev/null
> +++ b/arch/powerpc/tools/ftrace-gen-ool-stubs.sh
> @@ -0,0 +1,41 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +
> +# Error out on error
> +set -e
> +
> +is_64bit="$1"
> +objdump="$2"
> +vmlinux_o="$3"
> +arch_vmlinux_S="$4"
> +
> +RELOCATION=R_PPC64_ADDR64
> +if [ -z "$is_64bit" ]; then
> +       RELOCATION=R_PPC_ADDR32
> +fi
> +
> +num_ool_stubs_text=$($objdump -r -j __patchable_function_entries "$vmlinux_o" |
> +                    grep -v ".init.text" | grep -c "$RELOCATION")
> +num_ool_stubs_inittext=$($objdump -r -j __patchable_function_entries "$vmlinux_o" |
> +                        grep ".init.text" | grep -c "$RELOCATION")
> +
> +cat > "$arch_vmlinux_S" <<EOF
> +#include <asm/asm-offsets.h>
> +#include <linux/linkage.h>
> +
> +.pushsection .tramp.ftrace.text,"aw"
> +SYM_DATA(ftrace_ool_stub_text_end_count, .long $num_ool_stubs_text)
> +
> +SYM_CODE_START(ftrace_ool_stub_text_end)
> +       .space $num_ool_stubs_text * FTRACE_OOL_STUB_SIZE
> +SYM_CODE_END(ftrace_ool_stub_text_end)
> +.popsection
> +
> +.pushsection .tramp.ftrace.init,"aw"
> +SYM_DATA(ftrace_ool_stub_inittext_count, .long $num_ool_stubs_inittext)
> +
> +SYM_CODE_START(ftrace_ool_stub_inittext)
> +       .space $num_ool_stubs_inittext * FTRACE_OOL_STUB_SIZE


To avoid the warning mention in another thread,
it is better to avoid zero .space.





> +SYM_CODE_END(ftrace_ool_stub_inittext)
> +.popsection
> +EOF
> --
> 2.47.0
>


-- 
Best Regards
Masahiro Yamada

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ