Open Source and information security mailing list archives
 
Date:	Thu, 19 Nov 2009 14:28:06 -0500
From:	Steven Rostedt <rostedt@...dmis.org>
To:	David Daney <ddaney@...iumnetworks.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Haley <aph@...hat.com>,
	Richard Guenther <richard.guenther@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Heiko Carstens <heiko.carstens@...ibm.com>,
	feng.tang@...el.com, Frédéric Weisbecker <fweisbec@...il.com>,
	Peter Zijlstra <peterz@...radead.org>, jakub@...hat.com,
	gcc@....gnu.org
Subject: Re: BUG: GCC-4.4.x changes the function frame on some functions

On Thu, 2009-11-19 at 11:10 -0800, David Daney wrote:
> Linus Torvalds wrote:

> For the MIPS port of GCC and Linux I recently added the 
> -mmcount-ra-address switch.  It causes the location of the return 
> address (on the stack) to be passed to mcount in a scratch register.

Hehe, scratch register on i686 ;-)

i686 has no extra regs. It just has:

%eax, %ebx, %ecx, %edx - as the general purpose regs
%esp - stack
%ebp - frame pointer
%edi, %esi - index regs

That's just 8 regs, and half of those are special.

> 
> Perhaps something similar could be done for x86.  It would make this 
> patching of the return location more reliable at the expense of more 
> code at the mcount invocation site.

I'd rather not put any more code in the call site.

> 
> For the MIPS case the code size doesn't increase, as it is done in the 
> delay slot of the call instruction, which would otherwise be a nop.

I showed in a previous post what the best would be for x86. That is just
calling mcount at the very beginning of the function. The return address
is automatically pushed onto the stack.

Perhaps we could create another profiler? Instead of calling mcount,
call a new function: __fentry__ or something. Have it activated with
another switch. This could make the performance of the function tracer
even better without all these exceptions.

	<function>:
		call __fentry__
		[...]
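
To illustrate why a call at the very start of the function is attractive, here is a toy C model of a downward-growing stack. It is only a sketch, not kernel or GCC code: the struct, the helper names, and the two address constants are all made up for the example. It assumes the mcount-style hook is called after the prologue has pushed some words, and that the __fentry__-style hook is called before any prologue code runs.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a downward-growing stack of 32-bit words (not kernel code). */
#define STACK_WORDS 64

#define PARENT_RA 0x1000u  /* hypothetical return address into the caller */
#define SELF_RA   0x2000u  /* hypothetical return address into the traced function */

struct toy_stack {
	uint32_t slot[STACK_WORDS];
	int sp;                 /* index of the current top-of-stack slot */
};

static void toy_init(struct toy_stack *s)
{
	s->sp = STACK_WORDS;    /* empty stack: sp points just past the end */
}

static void toy_push(struct toy_stack *s, uint32_t v)
{
	assert(s->sp > 0);      /* guard against overflowing the toy stack */
	s->slot[--s->sp] = v;
}

/*
 * mcount style: the hook is called after the prologue, so the distance
 * (in words) from the hook's stack top to the parent's return address
 * depends on how much the prologue pushed (saved %ebp, padding, ...).
 */
int mcount_parent_ra_offset(int prologue_words)
{
	struct toy_stack s;
	int ra_slot, i;

	toy_init(&s);
	toy_push(&s, PARENT_RA);        /* caller: call <function> */
	ra_slot = s.sp;
	for (i = 0; i < prologue_words; i++)
		toy_push(&s, 0);        /* prologue pushes */
	toy_push(&s, SELF_RA);          /* call mcount */
	return ra_slot - s.sp;
}

/*
 * __fentry__ style: the hook is the first instruction of the function,
 * so the parent's return address always sits one word above the hook's
 * own return address, whatever the prologue does later.
 */
int fentry_parent_ra_offset(int prologue_words)
{
	struct toy_stack s;
	int ra_slot;

	toy_init(&s);
	toy_push(&s, PARENT_RA);        /* caller: call <function> */
	ra_slot = s.sp;
	toy_push(&s, SELF_RA);          /* call __fentry__ */
	(void)prologue_words;           /* prologue runs only after the hook returns */
	return ra_slot - s.sp;
}
```

In this model fentry_parent_ra_offset() returns 1 for any prologue size, while mcount_parent_ra_offset() returns prologue_words + 1, so the mcount-style hook has to know what the prologue did.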

	
-- Steve


