linux-kernel - Re: [PATCH RFC V5 00/11] Paravirtualized ticketlocks

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Fri, 14 Oct 2011 12:02:43 -0700
From:	Jeremy Fitzhardinge <jeremy@...p.org>
To:	Jason Baron <jbaron@...hat.com>
CC:	Peter Zijlstra <peterz@...radead.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>,
	the arch/x86 maintainers <x86@...nel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Nick Piggin <npiggin@...nel.dk>, Avi Kivity <avi@...hat.com>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	KVM <kvm@...r.kernel.org>, Andi Kleen <andi@...stfloor.org>,
	Xen Devel <xen-devel@...ts.xensource.com>,
	Jeremy Fitzhardinge <jeremy.fitzhardinge@...rix.com>,
	konrad.wilk@...cle.com, rth@...hat.com
Subject: Re: [PATCH RFC V5 00/11] Paravirtualized ticketlocks

On 10/14/2011 11:35 AM, Jason Baron wrote:
> On Fri, Oct 14, 2011 at 10:02:35AM -0700, Jeremy Fitzhardinge wrote:
>> On 10/14/2011 07:17 AM, Jason Baron wrote:
>>> On Thu, Oct 13, 2011 at 09:44:48AM -0700, Jeremy Fitzhardinge wrote:
>>>> pvops is basically a collection of ordinary _ops structures full of
>>>> function pointers, but it has a layer of patching to help optimise it. 
>>>> In the common case, this just replaces an indirect call with a direct
>>>> one, but in some special cases it can inline code.  This is used for
>>>> small, extremely performance-critical things like cli/sti, but it
>>>> awkward to use in general because you have to specify the inlined code
>>>> as a parameterless asm.
>>>>
>>> I haven't look at the pvops patching (probably should), but I was
>>> wondering if jump labels could be used for it? Or is there something
>>> that the pvops patching is doing that jump labels can't handle?
>> Jump labels are essentially binary: you can use path A or path B.  pvops
>> are multiway: there's no limit to the number of potential number of
>> paravirtualized hypervisor implementations.  At the moment we have 4:
>> native, Xen, KVM and lguest.
>>
> Yes, they are binary using the static_branch() interface. But in
> general, the asm goto() construct, allows branching to any number of
> labels. I have implemented the boolean static_branch() b/c it seems like
> the most common interface for jump labels, but I imagine we will
> introduce new interfaces as time goes on. You could of course nest
> static_branch() calls, although I can't say I've tried it.

At the moment we're using pvops to optimise things like:

	(*pv_mmu_ops.set_pte)(...);

To do that with some kind of multiway jump label thing, then that would
need to expand out to something akin to:

	if (static_branch(is_xen))
		xen_set_pte(...);
	else if (static_branch(is_kvm))
		kvm_set_pte(...);
	else if (static_branch(is_lguest))
		lguest_set_pte(...);
	else
		native_set_pte(...);

or something similar with an actual jump table.  But I don't see how it
offers much scope for improvement.

If there were something like:

	STATIC_INDIRECT_CALL(&pv_mmu_ops.set_pte)(...);

where the apparently indirect call is actually patched to be a direct
call, then that would offer a large subset of what we do with pvops.

However, to completely replace pvops patching, the static branch / jump
label mechanism would also need to work in assembler code, and be
capable of actually patching callsites with instructions rather than
just calls (sti/cli/pushf/popf being the most important).

We also keep track of the live registers at the callsite, and compare
that to what registers the target functions will clobber in order to
optimise the amount of register save/restore is needed.  And as a result
we have some pvops functions with non-standard calling conventions to
minimise save/restores on critical paths.

> We could have an interface, that allowed static branch(), to specifiy an
> arbitrary number of no-ops such that call-site itself could look anyway
> we want, if we don't know the bias at compile time. This, of course
> means potentially greater than 1 no-op in the fast path. I assume the
> pvops can have greater than 1 no-op in the fast path. Or is there a
> better solution here?

See above.  But pvops patching is pretty well tuned for its job.

However, I definitely think its worth investigating some way to reduce
the number of patching mechanisms, and if pvops patching doesn't stretch
static jumps in unnatural ways, then perhaps that's the way to go.

Thanks,
    J
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/