Date: Fri, 15 Aug 2008 17:34:37 -0400 (EDT)
From: Steven Rostedt <rostedt@...dmis.org>
To: Andi Kleen <andi@...stfloor.org>
cc: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
    Linus Torvalds <torvalds@...ux-foundation.org>,
    Jeremy Fitzhardinge <jeremy@...p.org>,
    LKML <linux-kernel@...r.kernel.org>,
    Ingo Molnar <mingo@...e.hu>,
    Thomas Gleixner <tglx@...utronix.de>,
    Peter Zijlstra <peterz@...radead.org>,
    Andrew Morton <akpm@...ux-foundation.org>,
    David Miller <davem@...emloft.net>,
    Roland McGrath <roland@...hat.com>,
    Ulrich Drepper <drepper@...hat.com>,
    Rusty Russell <rusty@...tcorp.com.au>,
    Gregory Haskins <ghaskins@...ell.com>,
    Arnaldo Carvalho de Melo <acme@...hat.com>,
    "Luis Claudio R. Goncalves" <lclaudio@...g.org>,
    Clark Williams <williams@...hat.com>
Subject: Re: Efficient x86 and x86_64 NOP microbenchmarks

[ Finally got my goodmis email back ]

On Wed, 13 Aug 2008, Andi Kleen wrote:

> > Sorry to ask, I feel I must be missing something, but I'm trying to
> > figure out where you propose to add the "call mcount" ? In the caller or
> > in the callee ?
>
> callee like gcc. caller would be likely more bloated because
> there are more calls than functions. Also if it was at the
> callee more code would be needed because the function currently
> executed couldn't be gotten from stack directly.
>
> > Or is it a different scheme I don't see ? I am trying to figure out how
> > you happen to do all that without dynamic code modification and manage
> > not to hurt performance.
>
> The dynamic code modification is only needed because there is no
> global table of the mcount call sites. So instead it discovers
> them at runtime, but that requires runtime save patching

The new code does not discover the places at runtime. The old code did
that. The "to kill a daemon" removed the runtime discovery and replaced it
with discovery at compile time.
>
> With a custom call scheme one could just build up a table of
> call sites at link time using an ELF section and then when
> tracing is enabled/disabled always patch them all in one go
> in a stop_machine(). Then you wouldn't need parallel execution safe
> patching anymore and it doesn't matter what the nops look like.

The current patch set pretty much does exactly this. Yes, I patch
at boot up all in one go, before the other CPUs are even active. This
takes all of 6 milliseconds to do. Not much extra time for bootup.

>
> The other advantage is that it would allow getting rid of
> the frame pointer.

This is the only advantage that you have.

-- Steve
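
[ Illustration, not part of the original message: the link-time call-site
table discussed above can be sketched in userspace C. This is only a rough
sketch under several assumptions. It relies on GCC or Clang on an ELF target
with GNU ld or lld, which define __start_<section> and __stop_<section>
symbols for sections whose names are valid C identifiers. The names
struct call_site, DEFINE_CALL_SITE, call_sites and patch_all_call_sites are
hypothetical, and each "patch" merely flips an int rather than rewriting
instruction bytes. It is not the code from the patch set. ]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record describing one call site. */
struct call_site {
    const char *name;   /* label, for illustration only */
    int *enabled;       /* stands in for the instruction bytes to patch */
};

/*
 * Each use emits one record into the "call_sites" ELF section. The linker
 * gathers the records from all object files into one contiguous table, so
 * nothing has to be discovered while the program runs.
 */
#define DEFINE_CALL_SITE(ident)                                   \
    static int ident##_enabled;                                   \
    static const struct call_site ident##_site                    \
        __attribute__((used, section("call_sites"))) =            \
        { #ident, &ident##_enabled }

/* Table bounds, provided by the linker (assumption: GNU ld or lld, ELF). */
extern const struct call_site __start_call_sites[];
extern const struct call_site __stop_call_sites[];

DEFINE_CALL_SITE(func_a);
DEFINE_CALL_SITE(func_b);
DEFINE_CALL_SITE(func_c);

/*
 * Walk the whole table once and switch every site on or off. Returns the
 * number of sites visited. A kernel would do the equivalent while no other
 * CPU can execute the code being changed; this sketch has no such concern.
 */
static size_t patch_all_call_sites(int enable)
{
    size_t count = 0;
    const struct call_site *site;

    for (site = __start_call_sites; site < __stop_call_sites; site++) {
        assert(site->enabled != NULL);
        *site->enabled = enable;
        count++;
    }
    return count;
}
```

[ Calling patch_all_call_sites(1) visits every record the linker collected,
and patch_all_call_sites(0) reverts them. Because all sites are changed in a
single pass from a prebuilt table, the sketch needs neither runtime discovery
nor per-site synchronization, which is the property the quoted text is
after. ]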