Message-ID: <de1bbbc9-3d66-a3dd-550f-509032be20ba@linux.alibaba.com>
Date: Sat, 11 Sep 2021 23:28:32 +0800
From: Yinan Liu <yinan@...ux.alibaba.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: mark-pk.tsai@...iatek.com, peterz@...radead.org, mingo@...hat.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] scripts: ftrace - move the nop-processing in
ftrace_init to compile time
This is my GCC version: gcc version 4.8.5 20150623 (Red Hat 4.8.5-44)
(GCC).
In fact, I do see the make_nop processing in recordmcount. I'm still
confused: why can this part be replaced directly?
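
For readers following the thread, the "nop-processing" under discussion can
be sketched as below. This is a minimal illustration, not the recordmcount
or ftrace code: patch_calls_to_nops() and its parameters are hypothetical
names, and the buffer stands in for a section of x86-64 text whose call-site
offsets have already been recorded. The 5-byte nop used is the standard
encoding of "nopl 0x0(%rax,%rax,1)".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x86-64: a near call is opcode 0xe8 followed by a 32-bit displacement. */
#define CALL_OPCODE 0xe8
#define INSN_SIZE   5

/* Encoding of "nopl 0x0(%rax,%rax,1)", a 5-byte nop. */
static const uint8_t nop5[INSN_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/*
 * Hypothetical helper: overwrite the call instruction at each recorded
 * offset with a 5-byte nop.
 *
 * Returns the number of sites patched, or -1 as soon as a recorded offset
 * is out of bounds or does not point at a call opcode.
 */
static int patch_calls_to_nops(uint8_t *text, size_t text_len,
                               const size_t *locs, size_t nr_locs)
{
    int patched = 0;

    for (size_t i = 0; i < nr_locs; i++) {
        size_t off = locs[i];

        /* The whole 5-byte instruction must fit inside the buffer. */
        if (off > text_len || text_len - off < INSN_SIZE)
            return -1;
        /* Refuse to touch anything that is not a call. */
        if (text[off] != CALL_OPCODE)
            return -1;

        memcpy(text + off, nop5, INSN_SIZE);
        patched++;
    }
    return patched;
}
```

Whether this loop runs over the recorded locations at boot or over the
object file at build time is the question this thread is about; the sketch
only shows the per-site work.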
On 2021/9/11 at 10:12 PM, Steven Rostedt wrote:
> On Sat, 11 Sep 2021 21:50:43 +0800
> Yinan Liu <yinan@...ux.alibaba.com> wrote:
>
>> When ftrace is enabled, ftrace_init will consume a period of
>> time, usually around 15~20ms. Approximately 60% of the time is
>> consumed by nop-processing. Moving the nop-processing to the
>> compile time can speed up the kernel boot process.
>>
>> performance test:
>> env: Intel(R) Xeon(R) CPU E5-2682 v4 @ 2.50GHz
>> method: before and after patching, compare the
>> total time of ftrace_init(), and verify
>> the functionality of ftrace.
>>
>> avg_time of ftrace_init:
>> with patch: 7.114ms
>> without patch: 15.763ms
> What compiler are you using? Because by default, gcc should already do
> this for you. In fact, recordmcount isn't even called with the latest
> gcc, as gcc creates mcount_loc and inserts nops.
>
> This was implemented before, but we used to have "ideal nops" that
> were determined at run time. Because different CPUs differed in which
> nop was most efficient, we had to do the conversion at run time.
> That is no longer the case today, so we can revisit this.
>
>> Signed-off-by: Yinan Liu <yinan@...ux.alibaba.com>
>> ---
>> kernel/trace/ftrace.c | 4 ++++
>> scripts/recordmcount.h | 14 ++++++++++++++
>> 2 files changed, 18 insertions(+)
>>
>> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
>> index c236da868990..ae3fba331179 100644
>> --- a/kernel/trace/ftrace.c
>> +++ b/kernel/trace/ftrace.c
>> @@ -6261,6 +6261,10 @@ static int ftrace_process_locs(struct module *mod,
>> * until we are finished with it, and there's no
>> * reason to cause large interrupt latencies while we do it.
>> */
>> +#if defined CONFIG_X86 || defined CONFIG_X86_64 || defined CONFIG_ARM || defined CONFIG_ARM64
> We don't list archs in generic files. The above needs to be something like:
>
> #ifdef ARCH_HAS_MCOUNT_NOP
>
> or some name like that, and then that macro gets defined by the arch
> header (include/asm/ftrace.h)
>
>
>
>> + ret = 0;
>> + goto out;
>> +#endif
> A blank line should be here.
>
>> if (!mod)
>> local_irq_save(flags);
>> ftrace_update_code(mod, start_pg);
> -- Steve
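
The review comment about not listing architectures in generic files can be
sketched as follows. This is only an illustration under stated assumptions:
ARCH_HAS_MCOUNT_NOP is the placeholder name proposed in the review ("or some
name like that"), not an existing kernel macro, and
sites_converted_at_boot() is a hypothetical stand-in for the generic code
path. In the kernel the #define would sit in the arch's asm/ftrace.h rather
than in the same file.

```c
/*
 * Arch side. In the kernel this would live in the architecture's ftrace
 * header. The macro name is the placeholder from the review, not an
 * existing kernel symbol.
 */
#define ARCH_HAS_MCOUNT_NOP 1

/*
 * Generic side. It tests the capability macro and never names an
 * architecture, so a new arch opts in by defining the macro in its own
 * header without touching this code.
 *
 * Hypothetical stand-in: returns how many call sites would still need to
 * be converted to nops at boot.
 */
static int sites_converted_at_boot(int nr_sites)
{
#ifdef ARCH_HAS_MCOUNT_NOP
    /* Nops were already written at build time; nothing left to do. */
    (void)nr_sites;
    return 0;
#else
    /* No build-time support: every recorded site is converted at boot. */
    return nr_sites;
#endif
}
```

With the macro defined, the generic function reports no boot-time work;
removing the #define makes it report every site, with no change to the
generic code.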