Message-ID: <de1bbbc9-3d66-a3dd-550f-509032be20ba@linux.alibaba.com>
Date:   Sat, 11 Sep 2021 23:28:32 +0800
From:   Yinan Liu <yinan@...ux.alibaba.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     mark-pk.tsai@...iatek.com, peterz@...radead.org, mingo@...hat.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] scripts: ftrace - move the nop-processing in
 ftrace_init to compile time

This is my GCC version: gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC).

I do see the make_nop processing in recordmcount, but I'm still
confused about why this part can be replaced directly.


On 2021/9/11 10:12 PM, Steven Rostedt wrote:
> On Sat, 11 Sep 2021 21:50:43 +0800
> Yinan Liu <yinan@...ux.alibaba.com> wrote:
>
>> When ftrace is enabled, ftrace_init will consume a period of
>> time, usually around 15~20ms. Approximately 60% of the time is
>> consumed by nop-processing. Moving the nop-processing to the
>> compile time can speed up the kernel boot process.
>>
>> performance test:
>>          env:    Intel(R) Xeon(R) CPU E5-2682 v4 @ 2.50GHz
>>          method: before and after patching, compare the
>>                  total time of ftrace_init(), and verify
>>                  the functionality of ftrace.
>>
>>          avg_time of ftrace_init:
>>                  with patch: 7.114ms
>>                  without patch: 15.763ms
> What compiler are you using? Because by default, gcc should already do
> this for you. In fact, recordmcount isn't even called with the latest
> gcc, as gcc creates mcount_loc and inserts nops.
>
> This was implemented before, but we used to have "ideal nops" that
> were determined at run time: because different CPUs differed in how
> efficiently they executed each nop, we had to do it at run time.
> But that is no longer the case today, so we can revisit this.
>
>> Signed-off-by: Yinan Liu <yinan@...ux.alibaba.com>
>> ---
>>   kernel/trace/ftrace.c  |  4 ++++
>>   scripts/recordmcount.h | 14 ++++++++++++++
>>   2 files changed, 18 insertions(+)
>>
>> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
>> index c236da868990..ae3fba331179 100644
>> --- a/kernel/trace/ftrace.c
>> +++ b/kernel/trace/ftrace.c
>> @@ -6261,6 +6261,10 @@ static int ftrace_process_locs(struct module *mod,
>>   	 * until we are finished with it, and there's no
>>   	 * reason to cause large interrupt latencies while we do it.
>>   	 */
>> +#if defined CONFIG_X86 || defined CONFIG_X86_64 || defined CONFIG_ARM || defined CONFIG_ARM64
> We don't list archs in generic files. The above needs to be something like:
>
> #ifdef ARCH_HAS_MCOUNT_NOP
>
> or some name like that, and then that macro gets defined by the arch
> header (include/asm/ftrace.h)
>
>
>
>> +	ret = 0;
>> +	goto out;
>> +#endif
> There should be a blank line here.
>
>>   	if (!mod)
>>   		local_irq_save(flags);
>>   	ftrace_update_code(mod, start_pg);
> -- Steve
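
The review comment about not listing architectures in generic files can be illustrated with a small user-space sketch. This is a toy model only, not kernel code: ARCH_HAS_MCOUNT_NOP is the placeholder name suggested in the review ("or some name like that"), and NEED_RUNTIME_NOP, process_locs(), patch_site_to_nop() and runtime_nop_patches are hypothetical names invented for this example.

```c
#include <stddef.h>

/*
 * Stand-in for what an arch header (include/asm/ftrace.h) would provide.
 * The macro name is the placeholder from the review, not an existing
 * kernel symbol. Remove this line to model an arch without the feature.
 */
#define ARCH_HAS_MCOUNT_NOP 1

/* Generic code tests the capability macro and never names an arch. */
#ifdef ARCH_HAS_MCOUNT_NOP
#define NEED_RUNTIME_NOP 0
#else
#define NEED_RUNTIME_NOP 1
#endif

/* Counts how many call sites this toy model patched at "boot". */
size_t runtime_nop_patches;

/* Hypothetical helper: pretend to rewrite one call site into a nop. */
void patch_site_to_nop(unsigned long addr)
{
	(void)addr;
	runtime_nop_patches++;
}

/* Toy analogue of the generic step that walks the recorded locations. */
int process_locs(const unsigned long *start, const unsigned long *end)
{
	const unsigned long *p;

	/* Build-time tooling already wrote the nops: nothing to do. */
	if (!NEED_RUNTIME_NOP)
		return 0;

	for (p = start; p < end; p++)
		patch_site_to_nop(*p);

	return 0;
}
```

With the macro defined, calling process_locs() over any range returns 0 without patching a site. Without it, every recorded site is visited. The generic function stays free of per-architecture conditionals in both cases.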
