Message-ID: <dd877b46-5327-4f9e-91d0-30a9e863a9cb@ghiti.fr>
Date: Wed, 7 May 2025 16:35:47 +0200
From: Alexandre Ghiti <alex@...ti.fr>
To: Andy Chiu <andybnac@...il.com>
Cc: linux-riscv@...ts.infradead.org, alexghiti@...osinc.com,
palmer@...belt.com, Andy Chiu <andy.chiu@...ive.com>,
Björn Töpel <bjorn@...osinc.com>,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
Mark Rutland <mark.rutland@....com>, puranjay12@...il.com,
paul.walmsley@...ive.com, greentime.hu@...ive.com, nick.hu@...ive.com,
nylon.chen@...ive.com, eric.lin@...ive.com, vicent.chen@...ive.com,
zong.li@...ive.com, yongxuan.wang@...ive.com, samuel.holland@...ive.com,
olivia.chu@...ive.com, c2232430@...il.com
Subject: Re: [PATCH v4 05/12] riscv: ftrace: prepare ftrace for atomic code
patching
Hi Andy,
On 07/05/2025 16:18, Andy Chiu wrote:
> On Mon, May 5, 2025 at 10:06 PM Alexandre Ghiti <alex@...ti.fr> wrote:
>> On 23/04/2025 10:22, Alexandre Ghiti wrote:
>>> Hi Andy,
>>>
>>> On 07/04/2025 20:08, Andy Chiu wrote:
>>>> From: Andy Chiu <andy.chiu@...ive.com>
>>>>
>>>> We use an AUIPC+JALR pair to jump into the ftrace trampoline. Since
>>>> instruction fetch can break down to 4 bytes at a time, it is impossible
>>>> to update two instructions without a race. In order to mitigate it, we
>>>> initialize the patchable entry to AUIPC + NOP4. Then, the run-time code
>>>> patching can change NOP4 to JALR to enable/disable ftrace for a
>>>> function. This limits the reach of each ftrace entry to a +-2KB
>>>> displacement from ftrace_caller.
>>>>
>>>> Starting from the trampoline, we add a level of indirection for it to
>>>> reach the ftrace caller target. Now, it loads the target address from a
>>>> memory location, then performs the jump. This enables the kernel to
>>>> update the target atomically.
>>>>
>>>> The new don't-stop-the-world text patching only changes one RISC-V
>>>> instruction:
>>>>
>>>> | -8: &ftrace_ops of the associated tracer function.
>>>> | <ftrace enable>:
>>>> | 0: auipc t0, hi(ftrace_caller)
>>>> | 4: jalr t0, lo(ftrace_caller)
>>>> |
>>>> | -8: &ftrace_nop_ops
>>>> | <ftrace disable>:
>>>> | 0: auipc t0, hi(ftrace_caller)
>>>> | 4: nop
>>>>
>>>> This means that f+0x0 is fixed, and should not be claimed by ftrace,
>>>> e.g. kprobe should be able to put a probe in f+0x0. Thus, we adjust the
>>>> offset and MCOUNT_INSN_SIZE accordingly.
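
Side note for readers following along: only the second 32-bit word at f+0x4
ever changes, which is what makes a single-instruction update sufficient.
A minimal user-space sketch of that idea (not from the patch; it only reuses
the canonical NOP and the JALR_T0 encoding that appear below, and the helper
names are made up for the example):

#include <stdint.h>
#include <stdbool.h>

#define RISCV_NOP4	0x00000013u	/* "addi x0, x0, 0", the canonical 4-byte nop */
#define JALR_T0		0x000282e7u	/* "jalr t0, 0(t0)" with the immediate cleared */

/* Build "jalr t0, lo12(t0)": the low 12 bits of the offset go in bits 31:20 */
static inline uint32_t jalr_t0(uint32_t offset)
{
	return ((offset & 0xfff) << 20) | JALR_T0;
}

/*
 * Only this second 32-bit word changes; the auipc at f+0x0 stays fixed,
 * so switching a tracer on or off boils down to one aligned 4-byte update.
 */
static inline void patch_second_word(uint32_t *slot, bool enable, uint32_t offset)
{
	*slot = enable ? jalr_t0(offset) : RISCV_NOP4;
}

Of course the kernel goes through the text-patching helpers rather than a
plain store; this is only meant to illustrate why patching a single
instruction is enough.
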
>>>>
>>>> Co-developed-by: Björn Töpel <bjorn@...osinc.com>
>>>> Signed-off-by: Björn Töpel <bjorn@...osinc.com>
>>>> Signed-off-by: Andy Chiu <andy.chiu@...ive.com>
>>>> ---
>>>> Changelog v4:
>>>> - Include Björn's fix for kprobe
>>>> - Refactor code for better reading (Robbin, Björn)
>>>> - Remove make_call_ra and friends (Björn)
>>>> - Update comments to match reality (Björn)
>>>> - Drop code defined by !WITH_ARG
>>>> - Add a synchronization point when updating ftrace_call_dest (Björn)
>>>> ---
>>>> arch/riscv/include/asm/ftrace.h | 49 ++++++------
>>>> arch/riscv/kernel/ftrace.c | 130 ++++++++++++++++----------------
>>>> arch/riscv/kernel/mcount-dyn.S | 9 +--
>>>> 3 files changed, 92 insertions(+), 96 deletions(-)
>>>>
>>>> diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
>>>> index d8b2138bd9c6..6a5c0a7fb826 100644
>>>> --- a/arch/riscv/include/asm/ftrace.h
>>>> +++ b/arch/riscv/include/asm/ftrace.h
>>>> @@ -20,10 +20,9 @@ extern void *return_address(unsigned int level);
>>>> #define ftrace_return_address(n) return_address(n)
>>>> void _mcount(void);
>>>> -static inline unsigned long ftrace_call_adjust(unsigned long addr)
>>>> -{
>>>> - return addr;
>>>> -}
>>>> +unsigned long ftrace_call_adjust(unsigned long addr);
>>>> +unsigned long arch_ftrace_get_symaddr(unsigned long fentry_ip);
>>>> +#define ftrace_get_symaddr(fentry_ip) arch_ftrace_get_symaddr(fentry_ip)
>>>> /*
>>>> * Let's do like x86/arm64 and ignore the compat syscalls.
>>>> @@ -57,12 +56,21 @@ struct dyn_arch_ftrace {
>>>> * 2) jalr: setting low-12 offset to ra, jump to ra, and set ra to
>>>> * return address (original pc + 4)
>>>> *
>>>> + * The first 2 instructions of each traceable function are compiled to
>>>> + * 2 nop instructions. Then, the kernel initializes the first instruction
>>>> + * to auipc at boot time (<ftrace disable>). The second instruction is
>>>> + * patched to jalr to start the trace.
>>>> + *
>>>> + *<Image>:
>>>> + * 0: nop
>>>> + * 4: nop
>>>> + *
>>>> *<ftrace enable>:
>>>> - * 0: auipc t0/ra, 0x?
>>>> - * 4: jalr t0/ra, ?(t0/ra)
>>>> + * 0: auipc t0, 0x?
>>>> + * 4: jalr t0, ?(t0)
>>>> *
>>>> *<ftrace disable>:
>>>> - * 0: nop
>>>> + * 0: auipc t0, 0x?
>>>> * 4: nop
>>>> *
>>>> * Dynamic ftrace generates probes to call sites, so we must deal with
>>>> @@ -75,10 +83,9 @@ struct dyn_arch_ftrace {
>>>> #define AUIPC_OFFSET_MASK (0xfffff000)
>>>> #define AUIPC_PAD (0x00001000)
>>>> #define JALR_SHIFT 20
>>>> -#define JALR_RA (0x000080e7)
>>>> -#define AUIPC_RA (0x00000097)
>>>> #define JALR_T0 (0x000282e7)
>>>> #define AUIPC_T0 (0x00000297)
>>>> +#define JALR_RANGE (JALR_SIGN_MASK - 1)
>>>> #define to_jalr_t0(offset) \
>>>> (((offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_T0)
>>>> @@ -96,26 +103,14 @@ do { \
>>>> call[1] = to_jalr_t0(offset); \
>>>> } while (0)
>>>> -#define to_jalr_ra(offset) \
>>>> - (((offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_RA)
>>>> -
>>>> -#define to_auipc_ra(offset) \
>>>> - ((offset & JALR_SIGN_MASK) ? \
>>>> - (((offset & AUIPC_OFFSET_MASK) + AUIPC_PAD) | AUIPC_RA) : \
>>>> - ((offset & AUIPC_OFFSET_MASK) | AUIPC_RA))
>>>> -
>>>> -#define make_call_ra(caller, callee, call) \
>>>> -do { \
>>>> - unsigned int offset = \
>>>> - (unsigned long) (callee) - (unsigned long) (caller); \
>>>> - call[0] = to_auipc_ra(offset); \
>>>> - call[1] = to_jalr_ra(offset); \
>>>> -} while (0)
>>>> -
>>>> /*
>>>> - * Let auipc+jalr be the basic *mcount unit*, so we make it 8 bytes here.
>>>> + * Only the jalr insn in the auipc+jalr is patched, so we make it 4
>>>> + * bytes here.
>>>> */
>>>> -#define MCOUNT_INSN_SIZE 8
>>>> +#define MCOUNT_INSN_SIZE 4
>>>> +#define MCOUNT_AUIPC_SIZE 4
>>>> +#define MCOUNT_JALR_SIZE 4
>>>> +#define MCOUNT_NOP4_SIZE 4
>>>> #ifndef __ASSEMBLY__
>>>> struct dyn_ftrace;
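
An aside for anyone puzzled by the AUIPC_PAD handling in to_auipc_t0()
above: jalr sign-extends its 12-bit immediate, so when bit 11 of the offset
is set the auipc part has to be bumped by 0x1000 to compensate. Here is a
standalone sketch of the same arithmetic that make_call_t0() performs
(constants copied from this header; the JALR_SIGN_MASK/JALR_OFFSET_MASK
values are assumed from the unchanged context, and the example offset is
made up):

#include <stdio.h>

#define JALR_SIGN_MASK		(0x00000800)
#define JALR_OFFSET_MASK	(0x00000fff)
#define AUIPC_OFFSET_MASK	(0xfffff000)
#define AUIPC_PAD		(0x00001000)
#define JALR_SHIFT		20
#define JALR_T0			(0x000282e7)
#define AUIPC_T0		(0x00000297)

int main(void)
{
	/* hypothetical distance from the patched site to ftrace_caller */
	unsigned int offset = 0x12800;		/* bit 11 set */
	unsigned int auipc, jalr;

	/* hi20 part: add 0x1000 when the low 12 bits will sign-extend negative */
	auipc = (offset & JALR_SIGN_MASK) ?
		(((offset & AUIPC_OFFSET_MASK) + AUIPC_PAD) | AUIPC_T0) :
		((offset & AUIPC_OFFSET_MASK) | AUIPC_T0);
	/* lo12 part lands in the jalr immediate field (bits 31:20) */
	jalr = ((offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_T0;

	/* auipc imm = 0x13000, jalr imm = 0x800 (-2048): 0x13000 - 0x800 = 0x12800 */
	printf("auipc insn: %08x, jalr insn: %08x\n", auipc, jalr);
	return 0;
}
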
>>>> diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
>>>> index 1fd10555c580..cf78eef073a0 100644
>>>> --- a/arch/riscv/kernel/ftrace.c
>>>> +++ b/arch/riscv/kernel/ftrace.c
>>>> @@ -8,10 +8,21 @@
>>>> #include <linux/ftrace.h>
>>>> #include <linux/uaccess.h>
>>>> #include <linux/memory.h>
>>>> +#include <linux/irqflags.h>
>>>> #include <linux/stop_machine.h>
>>>> #include <asm/cacheflush.h>
>>>> #include <asm/text-patching.h>
>>>> +unsigned long ftrace_call_adjust(unsigned long addr)
>>>> +{
>>>> + return addr + MCOUNT_AUIPC_SIZE;
>>>> +}
>>>> +
>>>> +unsigned long arch_ftrace_get_symaddr(unsigned long fentry_ip)
>>>> +{
>>>> + return fentry_ip - MCOUNT_AUIPC_SIZE;
>>>> +}
>>>> +
>>>
>>> Those functions cause the following errors when building with
>>> !CONFIG_DYNAMIC_FTRACE, but I'm not sure how to fix this:
>>>
>>> ../arch/riscv/kernel/ftrace.c: In function 'ftrace_call_adjust':
>>> ../arch/riscv/kernel/ftrace.c:19:35: error: 'MCOUNT_AUIPC_SIZE' undeclared (first use in this function)
>>> 19 | return addr + 8 + MCOUNT_AUIPC_SIZE;
>>> | ^~~~~~~~~~~~~~~~~
>>> ../arch/riscv/kernel/ftrace.c:19:35: note: each undeclared identifier is reported only once for each function it appears in
>>> CC fs/9p/vfs_dir.o
>>> ../arch/riscv/kernel/ftrace.c: In function 'arch_ftrace_get_symaddr':
>>> ../arch/riscv/kernel/ftrace.c:26:28: error: 'MCOUNT_AUIPC_SIZE' undeclared (first use in this function)
>>> 26 | return fentry_ip - MCOUNT_AUIPC_SIZE;
>>> | ^~~~~~~~~~~~~~~~~
>>> CC drivers/pci/pcie/pme.o
>>> ../arch/riscv/kernel/ftrace.c: In function 'ftrace_call_adjust':
>>> ../arch/riscv/kernel/ftrace.c:22:1: error: control reaches end of non-void function [-Werror=return-type]
>>> 22 | }
>>> | ^
>>> ../arch/riscv/kernel/ftrace.c: In function 'arch_ftrace_get_symaddr':
>>> ../arch/riscv/kernel/ftrace.c:27:1: error: control reaches end of non-void function [-Werror=return-type]
>>> 27 | }
>>> | ^
>>> cc1: some warnings being treated as errors
>>> make[5]: *** [../scripts/Makefile.build:203: arch/riscv/kernel/ftrace.o] Error 1
>>>
>> So I fixed those errors with the following, let me know if that's not
>> correct:
>>
>> diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
>> index d65f06bfb4573..4c6c24380cfd9 100644
>> --- a/arch/riscv/kernel/ftrace.c
>> +++ b/arch/riscv/kernel/ftrace.c
>> @@ -13,6 +13,7 @@
>> #include <asm/cacheflush.h>
>> #include <asm/text-patching.h>
>>
>> +#ifdef CONFIG_DYNAMIC_FTRACE
>> unsigned long ftrace_call_adjust(unsigned long addr)
>> {
>> if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS))
>> @@ -26,7 +27,6 @@ unsigned long arch_ftrace_get_symaddr(unsigned long fentry_ip)
>> return fentry_ip - MCOUNT_AUIPC_SIZE;
>> }
>>
>> -#ifdef CONFIG_DYNAMIC_FTRACE
>> void arch_ftrace_update_code(int command)
>> {
>> mutex_lock(&text_mutex);
>> @@ -191,7 +191,12 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
>> return 0;
>> }
>>
>> -#endif
>> +#else /* CONFIG_DYNAMIC_FTRACE */
>> +unsigned long ftrace_call_adjust(unsigned long addr)
>> +{
>> + return addr;
>> +}
>> +#endif /* CONFIG_DYNAMIC_FTRACE */
>>
>> #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>> int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
>>
> Hi Alex,
>
> Yes, this is valid, thanks for noticing and fixing the error. I would
> personally prefer leaving the #else /* CONFIG_DYNAMIC_FTRACE */ part
> in ftrace.h, but it can also come later as a refactor. Or, I can
> respin the series, with fixes on this and the previous patch, along
> with some typos and variable declarations that Robbin mentioned.
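
For the record, keeping the !CONFIG_DYNAMIC_FTRACE stub in ftrace.h instead
would look roughly like the below I suppose, i.e. bringing back the old
static inline next to the new declarations (untested sketch, just to make
sure we are talking about the same thing):

#ifdef CONFIG_DYNAMIC_FTRACE
unsigned long ftrace_call_adjust(unsigned long addr);
unsigned long arch_ftrace_get_symaddr(unsigned long fentry_ip);
#define ftrace_get_symaddr(fentry_ip)	arch_ftrace_get_symaddr(fentry_ip)
#else
static inline unsigned long ftrace_call_adjust(unsigned long addr)
{
	return addr;
}
#endif /* CONFIG_DYNAMIC_FTRACE */
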
If you can respin a new patchset soon, that's nice; in the meantime I'll
keep this version + my fixes (without Robbin's remarks though) in my
for-next branch, as we definitely want this in 6.16 :)
So it's up to you!
Thanks,
Alex
>
> Thanks,
> Andy