[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <62f06687-a20d-4f55-b22d-52af707b528a@yadro.com>
Date: Mon, 2 Dec 2024 10:58:51 +0300
From: Evgenii Shatokhin <e.shatokhin@...ro.com>
To: Andy Chiu <andybnac@...il.com>
CC: Justin Stitt <justinstitt@...gle.com>, Bill Wendling <morbo@...gle.com>,
Nick Desaulniers <ndesaulniers@...gle.com>, Nathan Chancellor
<nathan@...nel.org>, Albert Ou <aou@...s.berkeley.edu>, Palmer Dabbelt
<palmer@...belt.com>, Paul Walmsley <paul.walmsley@...ive.com>,
<linux-kernel@...r.kernel.org>, <linux-riscv@...ts.infradead.org>,
<llvm@...ts.linux.dev>, <bjorn@...osinc.com>, <puranjay12@...il.com>,
<alexghiti@...osinc.com>, <yongxuan.wang@...ive.com>,
<greentime.hu@...ive.com>, <nick.hu@...ive.com>, <nylon.chen@...ive.com>,
<tommy.wu@...ive.com>, <eric.lin@...ive.com>, <viccent.chen@...ive.com>,
<zong.li@...ive.com>, <samuel.holland@...ive.com>, <linux@...ro.com>
Subject: Re: [PATCH v3 0/7] riscv: ftrace: atmoic patching and preempt
improvements
Hi,
On 27.11.2024 20:29, Andy Chiu wrote:
> This series makes atmoic code patching possible in riscv ftrace. A
> direct benefit of this is that we can get rid of stop_machine() when
> patching function entries. This also makes it possible to run ftrace
> with full kernel preemption. Before this series, the kernel initializes
> patchable function entries to NOP4 + NOP4. To start tracing, it updates
> entries to AUIPC + JALR while holding other cores in stop_machine.
> stop_machine() is required because it is impossible to update 2
> instructions, and be seen atomically. And preemption must have to be
> prevented, as kernel preemption allows process to be scheduled out while
> executing on one of these instruction pairs.
>
> This series addresses the problem by initializing the first NOP4 to
> AUIPC. So, atmoic patching is possible because the kernel only has to
> update one instruction. As long as the instruction is naturally aligned,
> then it is expected to be updated atomically.
>
> However, the address range of the ftrace trampoline is limited to +-2K
> from ftrace_caller after appplying this series. This issue is expected
> to be solved by Puranjay's CALL_OPS, where it adds 8B naturally align
> data in front of pacthable functions and can use it to direct execution
> out to any custom trampolines.
>
> The series is composed by three parts. The first part cleans up the
> existing issues when the kernel is compiled with clang.The second part
> modifies the ftrace code patching mechanism (2-4) as mentioned above.
> Then prepare ftrace to be able to run with kernel preemption (5,6)
>
> An ongoing fix:
>
> Since there is no room for marking *kernel_text_address as notrace[1] at
> source code level, there is a significant performance regression when
> using function_graph with TRACE_IRQFLAGS enabled. There can be as much as
> 8 graph handler being called in each function-entry. The current
> workaround requires us echo "*kernel_text_address" into
> set_ftrace_notrace before starting the trace. However, we observed that
> the kernel still enables the patch site in some cases even with
> *kernel_text_address properly added in the file While the root cause is
> still under investagtion, we consider that it should not be the reason
> for holding back the code patching, in order to unblock the call_ops
> part.
>
> [1]: https://lore.kernel.org/linux-riscv/20240613093233.0b349ed0@rorschach.local.home/
>
> Changes in v3:
> - Add a fix tag for patch 1
> - Add a data fence before sending out remote fence.i (6)
> - Link to v2: https://lore.kernel.org/all/20240628-dev-andyc-dyn-ftrace-v4-v2-0-1e5f4cb1f049@sifive.com/
>
> Changes in v2:
> - Drop patch 1 as it is merged through fixes.
> - Drop patch 2, which converts kernel_text_address into notrace. As
> users can prevent tracing it by configuring the tracefs.
> - Use a more generic way in kconfig to align functions.
> - Link to v1: https://lore.kernel.org/r/20240613-dev-andyc-dyn-ftrace-v4-v1-0-1a538e12c01e@sifive.com
>
>
> Andy Chiu (7):
> riscv: ftrace: support fastcc in Clang for WITH_ARGS
> riscv: ftrace: align patchable functions to 4 Byte boundary
> riscv: ftrace: prepare ftrace for atomic code patching
> riscv: ftrace: do not use stop_machine to update code
> riscv: vector: Support calling schedule() for preemptible Vector
> riscv: add a data fence for CMODX in the kernel mode
> riscv: ftrace: support PREEMPT
>
> arch/riscv/Kconfig | 4 +-
> arch/riscv/include/asm/ftrace.h | 11 +++
> arch/riscv/include/asm/processor.h | 5 ++
> arch/riscv/include/asm/vector.h | 22 ++++-
> arch/riscv/kernel/asm-offsets.c | 7 ++
> arch/riscv/kernel/ftrace.c | 133 ++++++++++++-----------------
> arch/riscv/kernel/mcount-dyn.S | 25 ++++--
> arch/riscv/mm/cacheflush.c | 15 +++-
> 8 files changed, 135 insertions(+), 87 deletions(-)
> ---
> base-commit: 0eb512779d642b21ced83778287a0f7a3ca8f2a1
> --
> 2.39.3 (Apple Git-145)
I have tested this series in a QEMU VM (-machine virt) with the
preemptible kernels, CONFIG_PREEMPT=y.
No issues have been revealed so far.
One of the kernels was built by GCC 13.2.0 (with the patch for minimum
alignment added on top of it), the other - with LLVM 17.0.6.
In both cases, the basic boottime tests for Ftrace passed.
Switching tracers between nop, function, function_graph and blk in a
loop in parallel with stress-ng and with network load for several hours
did not reveal any problems either. Kernel crashes happened in this
scenario a year ago, but now it runs clean, good!
Function redirection via Ftrace seems to work OK too.
The size of .text section increased slightly after this series, by 0.35%
- 0.38%, probably because of function alignment:
* 12075 KB => 12121 KB (GCC)
* 12167 KB => 12209 KB (LLVM/clang)
Not sure, how to test the vector-related part though, "[PATCH v3 5/7]
riscv: vector: Support calling schedule() for preemptible Vector"
For all other patches in the series:
Tested-by: Evgenii Shatokhin <e.shatokhin@...ro.com>
Regards,
Evgenii
Powered by blists - more mailing lists