Message-ID: <CAJF2gTQ6U1vH79Mu53eQ-GVaFx36C-hEt9Qf6=_vAkHfmgFh1Q@mail.gmail.com>
Date: Tue, 7 Feb 2023 11:57:06 +0800
From: Guo Ren <guoren@...nel.org>
To: Mark Rutland <mark.rutland@....com>
Cc: Evgenii Shatokhin <e.shatokhin@...ro.com>, suagrfillet@...il.com,
andy.chiu@...ive.com, linux-riscv@...ts.infradead.org,
linux-kernel@...r.kernel.org, Guo Ren <guoren@...ux.alibaba.com>,
anup@...infault.org, paul.walmsley@...ive.com, palmer@...belt.com,
conor.dooley@...rochip.com, heiko@...ech.de, rostedt@...dmis.org,
mhiramat@...nel.org, jolsa@...hat.com, bp@...e.de,
jpoimboe@...nel.org, linux@...ro.com
Subject: Re: [PATCH -next V7 0/7] riscv: Optimize function trace
On Mon, Feb 6, 2023 at 5:56 PM Mark Rutland <mark.rutland@....com> wrote:
>
> On Sat, Feb 04, 2023 at 02:40:52PM +0800, Guo Ren wrote:
> > On Mon, Jan 16, 2023 at 11:02 PM Evgenii Shatokhin
> > <e.shatokhin@...ro.com> wrote:
> > >
> > > Hi,
> > >
> > > On 12.01.2023 12:05, guoren@...nel.org wrote:
> > > > From: Guo Ren <guoren@...ux.alibaba.com>
> > > >
> > > > The previous ftrace detour implementation fc76b8b8011 ("riscv: Using
> > > > PATCHABLE_FUNCTION_ENTRY instead of MCOUNT") contains three problems.
> > > >
> > > > - The most horrible bug is a preemption panic, which was found by Andy [1].
> > > > Let's disable preemption for ftrace first, and Andy could continue
> > > > the ftrace preemption work.
> > >
> > > It seems, the patches #2-#7 of this series do not require "riscv:
> > > ftrace: Fixup panic by disabling preemption" and can be used without it.
> > >
> > > How about moving that patch out of the series and processing it separately?
> > Okay.
> >
> > >
> > > As it was pointed out in the discussion of that patch, some other
> > > solution to non-atomic changes of the prologue might be needed anyway.
> > I think you mean Mark Rutland's DYNAMIC_FTRACE_WITH_CALL_OPS. But that
> > is not ready yet. Let's disable PREEMPT for ftrace first.
>
> FWIW, taking the patch to disable FTRACE with PREEMPT for now makes sense to
> me, too.
Thanks, glad you agree with that.
>
> The DYNAMIC_FTRACE_WITH_CALL_OPS patches should be in v6.3. They're currently
> queued in the arm64 tree in the for-next/ftrace branch:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/ftrace
> https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/
>
> ... and those *should* be in v6.3.
Glad to hear that. Great!
>
> Patches to implement DIRECT_CALLS atop that are in review at the moment:
>
> https://lore.kernel.org/linux-arm-kernel/20230201163420.1579014-1-revest@chromium.org/
Good reference. Thanks for sharing.
>
> ... and if riscv uses the CALL_OPS approach, I believe it can do much the same
> there.
>
> If riscv wants to do a single atomic patch to each patch-site (to avoid
> stop_machine()), then direct calls would always need to bounce through the
> ftrace_caller trampoline (and acquire the direct call from the ftrace_ops), but
> that might not be as bad as it sounds -- from benchmarking on arm64, the bulk
> of the overhead seen with direct calls is when using the list_ops or having to
> do a hash lookup, and both of those are avoided with the CALL_OPS approach.
> Calling directly from the patch-site is a minor optimization relative to
> skipping that work.
Yes, CALL_OPS could solve the PREEMPTION & stop_machine problems. I
will follow up.

The difference from arm64 is that RISC-V is an ISA with mixed
16-bit/32-bit instructions, so we must keep ftrace_caller &
ftrace_regs_caller in a 2048-aligned place. Then:

FTRACE_UPDATE_MAKE_CALL:
 * addr+00:       NOP          // Literal (first 32 bits)
 * addr+04:       NOP          // Literal (last 32 bits)
 * addr+08: func: auipc t0, ?  // All trampolines are in the 2048
                               // aligned place, so this point won't
                               // be changed.
 * addr+12:       jalr ?(t0)   // For different trampolines:
                               // ftrace_regs_caller, ftrace_caller

FTRACE_UPDATE_MAKE_NOP:
 * addr+00:       NOP          // Literal (first 32 bits)
 * addr+04:       NOP          // Literal (last 32 bits)
 * addr+08: func: c.j          // jump to addr + 16 and skip broken
                               // insn & jalr
 * addr+10:       xxx          // last half & broken insn of auipc t0, ?
 * addr+12:       jalr ?(t0)   // To be patched to jalr ?<t0> ()
 * addr+16:       func body

Right? (The call site ahead of func would grow from 64 bits to 128 bits.)
>
> Thanks,
> Mark.
--
Best Regards
Guo Ren