lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFwz6cpsaWZ-19h91_CNGB5C3d1bNOv7woxetXOTyJ_CRw@mail.gmail.com>
Date:	Mon, 5 Aug 2013 12:01:01 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	"H. Peter Anvin" <hpa@...ux.intel.com>
Cc:	Steven Rostedt <rostedt@...dmis.org>,
	LKML <linux-kernel@...r.kernel.org>, gcc <gcc@....gnu.org>,
	Ingo Molnar <mingo@...nel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	David Daney <ddaney.cavm@...il.com>,
	Behan Webster <behanw@...verseincode.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Herbert Xu <herbert@...dor.apana.org.au>
Subject: Re: [RFC] gcc feature request: Moving blocks into sections

On Mon, Aug 5, 2013 at 11:51 AM, H. Peter Anvin <hpa@...ux.intel.com> wrote:
>>
>> Also, how would you pass the parameters? Every tracepoint has its own
>> parameters to pass to it. How would a trap know what where to get "prev"
>> and "next"?
>
> How do you do that now?
>
> You have to do an IP lookup to find out what you are doing.

No, he just generates the code for the call and then uses a static_key
to jump to it. So normally it's all out-of-line, and the only thing in
the hot-path is that 5-byte nop (which gets turned into a 5-byte jump
when the tracing key is enabled)

Works fine, but the normally unused stubs end up mixing in the normal
code segment. Which I actually think is fine, but right now we don't
get the short-jump advantage from it (and there is likely some I$
disadvantage from just fragmentation of the code).

With two-byte jumps, you'd still get the I$ fragmentation (the
argument generation and the call and the branch back would all be in
the same code segment as the hot code), but that would be offset by
the fact that at least the hot code itself could use a short jump when
possible (ie a 2-byte nop rather than a 5-byte one).

Don't know which way it would go performance-wise. But it shouldn't
need gcc changes, it just needs the static key branch/nop rewriting to
be able to handle both sizes. I couldn't tell why Steven's series to
do that was so complex, though - I only glanced through the patches.

            Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ