[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aGUO8L7oXpvEpvZo@pie.lan>
Date: Wed, 2 Jul 2025 10:50:24 +0000
From: Yao Zi <ziyao@...root.org>
To: Alexandre Ghiti <alex@...ti.fr>, Andy Chiu <andybnac@...il.com>,
alexghiti@...osinc.com, palmer@...belt.com,
Andy Chiu <andy.chiu@...ive.com>,
Björn Töpel <bjorn@...osinc.com>,
Mark Rutland <mark.rutland@....com>, puranjay12@...il.com,
paul.walmsley@...ive.com, greentime.hu@...ive.com,
nick.hu@...ive.com, nylon.chen@...ive.com, eric.lin@...ive.com,
vicent.chen@...ive.com, zong.li@...ive.com,
yongxuan.wang@...ive.com, samuel.holland@...ive.com,
olivia.chu@...ive.com, c2232430@...il.com
Cc: Han Gao <rabenda.cn@...il.com>, Vivian Wang <wangruikang@...as.ac.cn>,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
regressions@...ts.linux.dev, linux-riscv@...ts.infradead.org
Subject: Re: [REGRESSION] Random oops on SG2042 with Linux 6.16-rc and
dynamic ftrace
On Tue, Jul 01, 2025 at 02:27:32PM +0200, Alexandre Ghiti wrote:
> Hi Yao,
>
> On 7/1/25 08:41, Yao Zi wrote:
> > Linux v6.16 built with dynamic ftrace randomly oops or triggers
> > ftrace_bug() on Sophgo SG2042 when booting systemd-based userspace,
...
> > Not sure either reverting the commits or fixing them up is a better
> > idea, but anyway the fatal first issue shouidn't go into the stable
> > release.
>
> Let's fix this, we were expecting issues with dynamic ftrace :)
>
> So the following diff fixes all the issues you mentioned (not the first
> crash though, I'll let you test and see if it works better, I don't have
> this board):
Thanks for the fix! I've tested it with both QEMU and SG2042, it does
fix the lockdep failures as well as the boot time crash on SG2042. The
boot-time crash is caused by the race so will disappear as long as we
fix the race.
> diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
> index 4c6c24380cfd9..97ced537aa1e0 100644
> --- a/arch/riscv/kernel/ftrace.c
> +++ b/arch/riscv/kernel/ftrace.c
> @@ -14,6 +14,16 @@
> #include <asm/text-patching.h>
>
> #ifdef CONFIG_DYNAMIC_FTRACE
> +void ftrace_arch_code_modify_prepare(void)
> +{
> + mutex_lock(&text_mutex);
> +}
> +
> +void ftrace_arch_code_modify_post_process(void)
> +{
> + mutex_unlock(&text_mutex);
> +}
> +
> unsigned long ftrace_call_adjust(unsigned long addr)
> {
> if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS))
> @@ -29,10 +39,8 @@ unsigned long arch_ftrace_get_symaddr(unsigned long
> fentry_ip)
>
> void arch_ftrace_update_code(int command)
> {
> - mutex_lock(&text_mutex);
> command |= FTRACE_MAY_SLEEP;
> ftrace_modify_all_code(command);
> - mutex_unlock(&text_mutex);
> flush_icache_all();
> }
>
> @@ -149,16 +157,17 @@ int ftrace_init_nop(struct module *mod, struct
> dyn_ftrace *rec)
> unsigned int nops[2], offset;
> int ret;
>
> + mutex_lock(&text_mutex);
Besides using the guard API, could we swap the order between
ftrace_rec_set_nop_ops() and calculation of the nops array? This shrinks
the critical region a little.
With or without the change, here's my tag,
Tested-by: Yao Zi <ziyao@...root.org>
and also
Reported-by: Han Gao <rabenda.cn@...il.com>
Reported-by: Vivian Wang <wangruikang@...as.ac.cn>
for their first-hand report of boot-time crash and analysis for the
first lock issue.
Regards,
Yao Zi
> ret = ftrace_rec_set_nop_ops(rec);
> if (ret)
> - return ret;
> + goto end;
>
> offset = (unsigned long) &ftrace_caller - pc;
> nops[0] = to_auipc_t0(offset);
> nops[1] = RISCV_INSN_NOP4;
>
> - mutex_lock(&text_mutex);
> ret = patch_insn_write((void *)pc, nops, 2 * MCOUNT_INSN_SIZE);
> +end:
> mutex_unlock(&text_mutex);
>
> return ret;
>
> Andy is also taking a look, I'll let him confirm the above fix is correct.
>
> Thanks for the thorough report!
>
> Alex
>
>
> >
> > Thanks for your suggestions on the problems.
> >
> > Regards,
> > Yao Zi
> >
> > [1]: https://lore.kernel.org/all/20250407180838.42877-1-andybnac@gmail.com/
> >
> > #regzbot introduced: 881dadf0792c
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@...ts.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
Powered by blists - more mailing lists