Message-ID: <CAJF2gTQGxxgusRgPdNaw4-d+o0a4vefUj7PNpZuym6VKQC4dhw@mail.gmail.com>
Date: Thu, 9 Feb 2023 09:51:03 +0800
From: Guo Ren <guoren@...nel.org>
To: David Laight <David.Laight@...lab.com>
Cc: Mark Rutland <mark.rutland@....com>,
Evgenii Shatokhin <e.shatokhin@...ro.com>,
"suagrfillet@...il.com" <suagrfillet@...il.com>,
"andy.chiu@...ive.com" <andy.chiu@...ive.com>,
"linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Guo Ren <guoren@...ux.alibaba.com>,
"anup@...infault.org" <anup@...infault.org>,
"paul.walmsley@...ive.com" <paul.walmsley@...ive.com>,
"palmer@...belt.com" <palmer@...belt.com>,
"conor.dooley@...rochip.com" <conor.dooley@...rochip.com>,
"heiko@...ech.de" <heiko@...ech.de>,
"rostedt@...dmis.org" <rostedt@...dmis.org>,
"mhiramat@...nel.org" <mhiramat@...nel.org>,
"jolsa@...hat.com" <jolsa@...hat.com>, "bp@...e.de" <bp@...e.de>,
"jpoimboe@...nel.org" <jpoimboe@...nel.org>,
"linux@...ro.com" <linux@...ro.com>
Subject: Re: [PATCH -next V7 0/7] riscv: Optimize function trace
On Thu, Feb 9, 2023 at 6:29 AM David Laight <David.Laight@...lab.com> wrote:
>
> > > # Note: aligned to 8 bytes
> > > addr-08 // Literal (first 32-bits) // patched to ops ptr
> > > addr-04 // Literal (last 32-bits) // patched to ops ptr
> > > addr+00 func: mv t0, ra
> > We needn't "mv t0, ra" here because our "jalr" could work with t0 and
> > won't affect ra. Let's do it in the trampoline code, and then we can
> > save another word here.
> > > addr+04 auipc t1, ftrace_caller
> > > addr+08 jalr ftrace_caller(t1)
>
> Is that some kind of 'load high' and 'add offset' pair?
Yes.
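
To make the pairing concrete: auipc adds a 20-bit immediate, shifted left
by 12, to the pc, and jalr then adds a sign-extended 12-bit immediate to
that result. Because the low part is signed, the high part has to be
rounded rather than truncated. The following is only a minimal sketch of
that split; the names split_pcrel and pcrel_pair are hypothetical and do
not come from the patch series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, for illustration only. */
struct pcrel_pair {
    int64_t hi20; /* auipc immediate: contributes hi20 * 4096 to pc */
    int64_t lo12; /* jalr immediate: signed, in [-2048, 2047] */
};

/* Split a pc-relative byte offset into the auipc/jalr immediates. */
static struct pcrel_pair split_pcrel(int64_t offset)
{
    struct pcrel_pair p;

    /* Sign-extend the low 12 bits of the offset. */
    p.lo12 = ((offset & 0xfff) ^ 0x800) - 0x800;
    /* The remainder is an exact multiple of 4096. */
    p.hi20 = (offset - p.lo12) / 4096;

    /* The two parts must add back up to the original offset. */
    assert(p.hi20 * 4096 + p.lo12 == offset);
    return p;
}
```

For an offset of 0x12345fff this gives lo12 = -1 and hi20 = 0x12346, which
is the rounding case that a plain shift-and-mask would get wrong.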
> I guess 64bit kernels guarantee to put all module code
> within +-2G of the main kernel?
Yes, 32 bits is enough, so the current rv64 needs only one 32-bit literal,
just like CONFIG_32BIT.
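
As a rough illustration of the +-2G reach being asked about: on rv64 the
auipc immediate is sign-extended after the shift, so an auipc/jalr pair
can reach from pc - 2^31 - 2^11 up to pc + 2^31 - 2^11 - 1. The check
below is only a sketch of that bound; fits_auipc_jalr is a hypothetical
name, not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if a pc-relative byte offset can be encoded by one
 * auipc (signed 20-bit immediate << 12) plus one jalr (signed 12-bit
 * immediate) on rv64. Hypothetical helper, for illustration only.
 */
static bool fits_auipc_jalr(int64_t offset)
{
    const int64_t min = -(INT64_C(1) << 31) - (INT64_C(1) << 11);
    const int64_t max = (INT64_C(1) << 31) - (INT64_C(1) << 11) - 1;

    return offset >= min && offset <= max;
}
```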
>
> > Here is the call-site:
> > # Note: aligned to 8 bytes
> > addr-08 // Literal (first 32-bits) // patched to ops ptr
> > addr-04 // Literal (last 32-bits) // patched to ops ptr
> > addr+00 auipc t0, ftrace_caller
> > addr+04 jalr ftrace_caller(t0)
>
> Could you even do something like:
> addr-n call ftrace-function
> addr-n+x literals
> addr+0 nop or jmp addr-n
> addr+4 function_code
Yours costs one more instruction, right?
addr-12 auipc
addr-8 jalr
addr-4 // Literal (32-bits)
addr+0 nop or jmp addr-n // one more?
addr+4 function_code
> So that all the code executed when tracing is enabled
> is before the label and only one 'nop' is in the body.
> The called code can use the return address to find the
> literals and then modify it to return to addr+4.
> The code cost when trace is enabled is probably irrelevant
> here - dominated by what happens later.
> It probably isn't even worth aligning a 64bit constant.
> Doing two reads probably won't be noticeable.
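
The "two reads" idea in the layout proposed above can be sketched in C as
follows. This assumes, purely for illustration, that the return address
points directly at an 8-byte literal stored low word first, and that the
nop/jmp slot after it is one 4-byte instruction. All function names are
hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Read one 32-bit word; memcpy avoids any alignment requirement. */
static uint32_t read_u32(const unsigned char *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/* Assemble a 64-bit literal from two 32-bit reads (low word first,
 * which is an assumption of this sketch). */
static uint64_t read_literal64(const unsigned char *lit)
{
    uint64_t lo = read_u32(lit);
    uint64_t hi = read_u32(lit + 4);

    return (hi << 32) | lo;
}

/* Skip the 8-byte literal and the 4-byte nop/jmp slot, so that
 * execution resumes at addr+4 (assumed sizes, see above). */
static const unsigned char *resume_address(const unsigned char *ret)
{
    return ret + 8 + 4;
}
```

Since each half is read separately, the literal only needs 4-byte
alignment here, which is the point about not aligning the 64-bit constant.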
>
> What you do want to ensure is that the initial patch is
> overwriting nop - just in case the gap isn't there.
>
> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
--
Best Regards
Guo Ren