[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAK7LNAQV+B=Jx1o3j3YkVL6CuTz5uPUnS+340KGA7aKs2eLxXw@mail.gmail.com>
Date: Tue, 27 Aug 2024 18:12:22 +0900
From: Masahiro Yamada <masahiroy@...nel.org>
To: Naveen N Rao <naveen@...nel.org>
Cc: linuxppc-dev@...ts.ozlabs.org, linux-trace-kernel@...r.kernel.org,
bpf@...r.kernel.org, linux-kbuild@...r.kernel.org,
linux-kernel@...r.kernel.org, Michael Ellerman <mpe@...erman.id.au>,
Nicholas Piggin <npiggin@...il.com>, Christophe Leroy <christophe.leroy@...roup.eu>,
Steven Rostedt <rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>,
Mark Rutland <mark.rutland@....com>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>,
Hari Bathini <hbathini@...ux.ibm.com>, Mahesh Salgaonkar <mahesh@...ux.ibm.com>,
Vishal Chourasia <vishalc@...ux.ibm.com>
Subject: Re: [RFC PATCH v4 12/17] powerpc64/ftrace: Move ftrace sequence out
of line
On Sun, Jul 14, 2024 at 5:29 PM Naveen N Rao <naveen@...nel.org> wrote:
>
> Function profile sequence on powerpc includes two instructions at the
> beginning of each function:
> mflr r0
> bl ftrace_caller
>
> The call to ftrace_caller() gets nop'ed out during kernel boot and is
> patched in when ftrace is enabled.
>
> Given the sequence, we cannot return from ftrace_caller with 'blr' as we
> need to keep LR and r0 intact. This results in link stack (return
> address predictor) imbalance when ftrace is enabled. To address that, we
> would like to use a three instruction sequence:
> mflr r0
> bl ftrace_caller
> mtlr r0
>
> Further more, to support DYNAMIC_FTRACE_WITH_CALL_OPS, we need to
> reserve two instruction slots before the function. This results in a
> total of five instruction slots to be reserved for ftrace use on each
> function that is traced.
>
> Move the function profile sequence out-of-line to minimize its impact.
> To do this, we reserve a single nop at function entry using
> -fpatchable-function-entry=1 and add a pass on vmlinux.o to determine
> the total number of functions that can be traced. This is then used to
> generate a .S file reserving the appropriate amount of space for use as
> ftrace stubs, which is built and linked into vmlinux.
>
> On bootup, the stub space is split into separate stubs per function and
> populated with the proper instruction sequence. A pointer to the
> associated stub is maintained in dyn_arch_ftrace.
>
> For modules, space for ftrace stubs is reserved from the generic module
> stub space.
>
> This is restricted to and enabled by default only on 64-bit powerpc,
> though there are some changes to accommodate 32-bit powerpc. This is
> done so that 32-bit powerpc could choose to opt into this based on
> further tests and benchmarks.
>
> As an example, after this patch, kernel functions will have a single nop
> at function entry:
> <kernel_clone>:
> addis r2,r12,467
> addi r2,r2,-16028
> nop
> mfocrf r11,8
> ...
>
> When ftrace is enabled, the nop is converted to an unconditional branch
> to the stub associated with that function:
> <kernel_clone>:
> addis r2,r12,467
> addi r2,r2,-16028
> b ftrace_ool_stub_text_end+0x11b28
> mfocrf r11,8
> ...
>
> The associated stub:
> <ftrace_ool_stub_text_end+0x11b28>:
> mflr r0
> bl ftrace_caller
> mtlr r0
> b kernel_clone+0xc
> ...
>
> Signed-off-by: Naveen N Rao <naveen@...nel.org>
> diff --git a/arch/powerpc/tools/Makefile b/arch/powerpc/tools/Makefile
> new file mode 100644
> index 000000000000..31dd3151c272
> --- /dev/null
> +++ b/arch/powerpc/tools/Makefile
> @@ -0,0 +1,10 @@
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +
> +quiet_cmd_gen_ftrace_ool_stubs = FTRACE $@
This is not "FTRACE".
"GEN" or something like that.
> + cmd_gen_ftrace_ool_stubs = $< $(objtree)/vmlinux.o $@
> +
> +targets += .arch.vmlinux.o
> +.arch.vmlinux.o: $(srctree)/arch/powerpc/tools/ftrace-gen-ool-stubs.sh $(objtree)/vmlinux.o FORCE
> + $(call if_changed,gen_ftrace_ool_stubs)
> +
> +clean-files += $(objtree)/.arch.vmlinux.S $(objtree)/.arch.vmlinux.o
This is wrong. $(objtree) is always '.'
It will attempt to clean up:
arch/powerpc/tools/.arch.vmlinux.S
arch/powerpc/tools/.arch.vmlinux.o
You must not create the intermediate file,
.arch.vmlinux.S at the top directory because
this build step is pretty much PowerPC-specific.
Rather, I'd recommend to create *.S and *.o in
arch/powerpc/tools/:
arch/powerpc/tools/vmlinux.S
arch/powerpc/tools/vmlinux.o
When you submit the next version, please run 'make clean'
and confirm that any powerpc-specific build artifacts
not being left-over.
> diff --git a/arch/powerpc/tools/ftrace-gen-ool-stubs.sh b/arch/powerpc/tools/ftrace-gen-ool-stubs.sh
> new file mode 100755
> index 000000000000..0b85cd5262ff
> --- /dev/null
> +++ b/arch/powerpc/tools/ftrace-gen-ool-stubs.sh
> @@ -0,0 +1,48 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +
> +# Error out on error
> +set -e
> +
> +is_enabled() {
> + grep -q "^$1=y" include/config/auto.conf
> +}
> +
> +vmlinux_o=${1}
> +arch_vmlinux_o=${2}
> +arch_vmlinux_S=$(dirname ${arch_vmlinux_o})/$(basename ${arch_vmlinux_o} .o).S
> +
> +RELOCATION=R_PPC64_ADDR64
> +if is_enabled CONFIG_PPC32; then
> + RELOCATION=R_PPC_ADDR32
> +fi
> +
> +num_ool_stubs_text=$(${CROSS_COMPILE}objdump -r -j __patchable_function_entries ${vmlinux_o} |
> + grep -v ".init.text" | grep "${RELOCATION}" | wc -l)
> +num_ool_stubs_inittext=$(${CROSS_COMPILE}objdump -r -j __patchable_function_entries ${vmlinux_o} |
> + grep ".init.text" | grep "${RELOCATION}" | wc -l)
> +
> +cat > ${arch_vmlinux_S} <<EOF
> +#include <asm/asm-offsets.h>
> +#include <linux/linkage.h>
> +
> +.pushsection .tramp.ftrace.text,"aw"
> +SYM_DATA(ftrace_ool_stub_text_end_count, .long ${num_ool_stubs_text})
> +
> +SYM_CODE_START(ftrace_ool_stub_text_end)
> + .space ${num_ool_stubs_text} * FTRACE_OOL_STUB_SIZE
> +SYM_CODE_END(ftrace_ool_stub_text_end)
> +.popsection
> +
> +.pushsection .tramp.ftrace.init,"aw"
> +SYM_DATA(ftrace_ool_stub_inittext_count, .long ${num_ool_stubs_inittext})
> +
> +SYM_CODE_START(ftrace_ool_stub_inittext)
> + .space ${num_ool_stubs_inittext} * FTRACE_OOL_STUB_SIZE
> +SYM_CODE_END(ftrace_ool_stub_inittext)
> +.popsection
> +EOF
> +
> +${CC} ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS} \
> + ${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL} \
> + -c -o ${arch_vmlinux_o} ${arch_vmlinux_S}
Please do not compile this within a shell script.
scripts/Makefile.build provides rule_as_o_S to do this.
[1] vmlinux.o --> arch/powerpc/tools/vmlinux.S
[2] arch/powerpc/tools/vmlinux.S --> arch/powerpc/tools/vmlinux.o
Please split these in separate build rules.
> --
> 2.45.2
>
--
Best Regards
Masahiro Yamada
Powered by blists - more mailing lists