linux-kernel - Re: Efficient x86 and x86

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <alpine.DEB.1.10.0808151535300.24788@gandalf.stny.rr.com>
Date:	Fri, 15 Aug 2008 17:34:37 -0400 (EDT)
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Andi Kleen <andi@...stfloor.org>
cc:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	David Miller <davem@...emloft.net>,
	Roland McGrath <roland@...hat.com>,
	Ulrich Drepper <drepper@...hat.com>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Gregory Haskins <ghaskins@...ell.com>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	"Luis Claudio R. Goncalves" <lclaudio@...g.org>,
	Clark Williams <williams@...hat.com>
Subject: Re: Efficient x86 and x86_64 NOP microbenchmarks


[ Finally got my goodmis email back ]

On Wed, 13 Aug 2008, Andi Kleen wrote:

> > Sorry to ask, I feel I must be missing something, but I'm trying to
> > figure out where you propose to add the "call mcount" ? In the caller or
> > in the callee ?
> 
> callee like gcc. caller would be likely more bloated because
> there are more calls than functions. Also if it was at the 
> callee more code would be needed because the function currently
> executed couldn't be gotten from stack directly.
> 
> > Or is it a different scheme I don't see ? I am trying to figure out how
> > you happen to do all that without dynamic code modification and manage
> > not to hurt performance.
> 
> The dynamic code modification is only needed because there is no
> global table of the mcount call sites. So instead it discovers
> them at runtime, but that requires runtime save patching

The new code does not discover the places at runtime. The old code did 
that. The "to kill a daemon" removed the runtime discovery and replaced it 
with discovery at compile time.

> 
> With a custom call scheme one could just build up a table of 
> call sites at link time using an ELF section and then when
> tracing is enabled/disabled always patch them all in one go
> in a stop_machine(). Then you wouldn't need parallel execution safe
> patching anymore and it doesn't matter what the nops look like.

The current patch set, pretty much does exactly this. Yes, I patch
at boot up all in one go, before the other CPUS are even active.
This takes all of 6 milliseconds to do. Not much extra time for bootup.

> 
> The other advantage is that it would allow getting rid of
> the frame pointer.

This is the only advantage that you have.

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/