[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.00.1202281149210.23091@kaball-desktop>
Date: Tue, 28 Feb 2012 12:28:29 +0000
From: Stefano Stabellini <stefano.stabellini@...citrix.com>
To: Ian Campbell <Ian.Campbell@...rix.com>
CC: Dave Martin <dave.martin@...aro.org>,
Stefano Stabellini <Stefano.Stabellini@...citrix.com>,
"xen-devel@...ts.xensource.com" <xen-devel@...ts.xensource.com>,
"linaro-dev@...ts.linaro.org" <linaro-dev@...ts.linaro.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"arnd@...db.de" <arnd@...db.de>,
"catalin.marinas@....com" <catalin.marinas@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
David Vrabel <david.vrabel@...rix.com>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH-WIP 01/13] xen/arm: use r12 to pass the hypercall number
to the hypervisor
On Tue, 28 Feb 2012, Ian Campbell wrote:
> On Tue, 2012-02-28 at 10:20 +0000, Dave Martin wrote:
> > On Mon, Feb 27, 2012 at 07:33:39PM +0000, Ian Campbell wrote:
> > > On Mon, 2012-02-27 at 18:03 +0000, Dave Martin wrote:
> > > > > Since we support only ARMv7+ there are "T2" and "T3" encodings available
> > > > > which do allow direct mov of an immediate into R12, but are 32 bit Thumb
> > > > > instructions.
> > > > >
> > > > > Should we use r7 instead to maximise instruction density for Thumb code?
> > > >
> > > > The difference seems trivial when put into context, even if you code a
> > > > special Thumb version of the code to maximise density (the Thumb-2 code
> > > > which gets built from assembler in the kernel is very suboptimal in
> > > > size, but there simply isn't a high proportion of asm code in the kernel
> > > > anyway.) I wouldn't consider the ARM/Thumb differences as an important
> > > > factor when deciding on a register.
> > >
> > > OK, that's useful information. thanks.
> > >
> > > > One argument for _not_ using r12 for this purpose is that it is then
> > > > harder to put a generic "HVC" function (analogous to the "syscall"
> > > > syscall) out-of-line, since r12 could get destroyed by the call.
> > >
> > > For an out of line syscall(2) wouldn't the syscall number either be in a
> > > standard C calling convention argument register or on the stack when the
> > > function was called, since it is just a normal argument at that point?
> > > As you point out it cannot be passed in r12 (and could never be, due to
> > > the clobbering).
> > >
> > > The syscall function itself would have to move the arguments and syscall
> > > nr etc around before issuing the syscall.
> > >
> > > I think the same is true of a similar hypercall(2)
> > >
> > > > If you don't think you will ever care about putting HVC out of line
> > > > though, it may not matter.
> >
> > If you have both inline and out-of-line hypercalls, it's hard to ensure
> > that you never have to shuffle the registers in either case.
>
> Agreed.
>
> I think we want to optimise for the inline case since those are the
> majority.
They are not just the majority, all of them are static inline at the
moment, even on x86 (where the number of hypercalls is much higher).
So yes, we should optimize for the inline case.
> The only non-inline case is the special "privcmd ioctl" which is the
> mechanism that allows the Xen toolstack to make hypercalls. It's
> somewhat akin to syscall(2). By the time you get to it you will already
> have done a system call for the ioctl, pulled the arguments from the
> ioctl argument structure etc, plus such hypercalls are not really
> performance critical.
Even the privcmd hypercall (privcmd_call) is a static inline function,
it is just that at the moment there is only one caller :)
> > Shuffling can be reduced but only at the expense of strange argument
> > ordering in some cases when calling from C -- the complexity is probably
> > not worth it. Linux doesn't bother for its own syscalls.
> >
> > Note that even in assembler, a branch from one section to a label in
> > another section may cause r12 to get destroyed, so you will need to be
> > careful about how you code the hypervisor trap handler. However, this
> > is not different from coding exception handlers in general, so I don't
> > know that it constitutes a conclusive argument on its own.
>
> We are happy to arrange that this doesn't occur on our trap entry paths,
> at least until the guest register state has been saved. Currently the
> hypercall dispatcher is in C and gets r12 from the on-stack saved state.
> We will likely eventually optimise the hypercall path directly in ASM
> and in that case we are happy to take steps to ensure we don't clobber
> r12 before we need it.
Yes, I don't think this should be an issue.
> > My instinctive preference would therefore be for r7 (which also seems to
> > be good enough for Linux syscalls) -- but it really depends how many
> > arguments you expect to need to support.
>
> Apparently r7 is the frame pointer for gcc in thumb mode which I think
> is a good reason to avoid it.
>
> We currently have some 5 argument hypercalls and there have been
> occasional suggestions for interfaces which use 6 -- although none of
> them have come to reality.
I don't have a very strong opinion on which register we should use, but
I would like to avoid r7 if it is already actively used by gcc.
The fact that r12 can be destroyed so easily is actually a good argument
for using it because it means it is less likely to contain useful data
that needs to be saved/restored by gcc.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists