Message-ID: <497906A4.2030008@zytor.com>
Date:	Thu, 22 Jan 2009 15:52:04 -0800
From:	"H. Peter Anvin" <hpa@...or.com>
To:	Zachary Amsden <zach@...are.com>
CC:	Jeremy Fitzhardinge <jeremy@...p.org>,
	Nick Piggin <npiggin@...e.de>, Ingo Molnar <mingo@...e.hu>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"jeremy@...source.com" <jeremy@...source.com>,
	"chrisw@...s-sol.org" <chrisw@...s-sol.org>,
	"rusty@...tcorp.com.au" <rusty@...tcorp.com.au>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Xen-devel <xen-devel@...ts.xensource.com>
Subject: Re: lmbench lat_mmap slowdown with CONFIG_PARAVIRT

Zachary Amsden wrote:
> On Thu, 2009-01-22 at 14:49 -0800, H. Peter Anvin wrote:
> 
>> There is also the option to use assembly wrappers to avoid relying on 
>> the calling convention.  This is particularly so since we have sites 
>> where as little as a two-byte instruction gets bloated up with huge 
>> push/pop sequences around a tiny instruction.  Those would be better 
>> served with a direct call to a stub (5 bytes), which would be repatched 
>> to the two-byte instruction + 3 byte nop.
> 
> Yes, for known trivial ops (most!), there isn't any reason to ever have
> a call to begin with; simply an inline instruction sequence would be
> fine, and only those callers that override the sequence would need to
> patch.  It's possible to write clever macros to assure there is always
> space for a 5 byte call.
> 

Functionally speaking, it's the same thing... the advantage of starting
out with the call and then patching in the native code, as opposed to
the other way around, is that we can handle things properly before
we're ready to run the patching code.
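
To make the byte accounting concrete, here is a minimal user-space
sketch of the call-then-patch idea.  It is not the kernel's paravirt
patching code: emit_call(), patch_native() and CALL_SITE_LEN are
hypothetical names, the site is just a byte buffer, and a little-endian
host is assumed.  The site starts as a 5-byte direct call (0xE8 plus a
rel32 displacement) and is later overwritten with the native
instruction padded out to 5 bytes with a single multi-byte NOP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CALL_SITE_LEN 5   /* x86 "call rel32": opcode 0xE8 + 4-byte displacement */

/* Multi-byte NOP encodings recommended by Intel, indexed by length (1..4). */
static const uint8_t nops[5][4] = {
    { 0 },                        /* unused: no padding needed */
    { 0x90 },                     /* nop */
    { 0x66, 0x90 },               /* 66 nop */
    { 0x0f, 0x1f, 0x00 },         /* nopl (%eax) */
    { 0x0f, 0x1f, 0x40, 0x00 },   /* nopl 0x0(%eax) */
};

/* Initial form of a site: a direct call to a stub, usable before patching runs. */
static void emit_call(uint8_t site[CALL_SITE_LEN], int32_t rel32)
{
    assert(site != NULL);
    site[0] = 0xe8;
    memcpy(&site[1], &rel32, sizeof(rel32));   /* little-endian host assumed */
}

/*
 * Replace the call with native code plus NOP padding.
 * Returns 0 on success, -1 if the site is not a call or the code does not fit.
 */
static int patch_native(uint8_t site[CALL_SITE_LEN],
                        const uint8_t *insn, size_t len)
{
    size_t pad;

    assert(site != NULL && insn != NULL);
    if (site[0] != 0xe8 || len == 0 || len > CALL_SITE_LEN)
        return -1;

    pad = CALL_SITE_LEN - len;
    memcpy(site, insn, len);
    if (pad)
        memcpy(site + len, nops[pad], pad);
    return 0;
}
```

Patching a two-byte instruction such as wrmsr (0f 30) into a site leaves
0f 30 0f 1f 00: the instruction followed by one 3-byte NOP.  Until
patch_native() runs, the site is still a valid call to the stub.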

Right now a number of the call sites contain a huge push/pop sequence 
followed by an indirect call.  We can patch in the native code to avoid 
the branch overhead, but the register constraints and icache footprint 
are unchanged.

	-hpa
