Message-ID: <20160415021422.GB6956@localhost.localdomain>
Date: Thu, 14 Apr 2016 22:14:22 -0400
From: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
To: "Luis R. Rodriguez" <mcgrof@...nel.org>
Cc: George Dunlap <george.dunlap@...rix.com>,
Matt Fleming <matt@...eblueprint.co.uk>, jeffm@...e.com,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Jim Fehlig <jfehlig@...e.com>, Jan Beulich <JBeulich@...e.com>,
"H. Peter Anvin" <hpa@...or.com>,
Daniel Kiper <daniel.kiper@...cle.com>,
the arch/x86 maintainers <x86@...nel.org>,
Takashi Iwai <tiwai@...e.de>,
Vojtěch Pavlík <vojtech@...e.cz>,
Gary Lin <GLin@...e.com>,
xen-devel <xen-devel@...ts.xenproject.org>,
Jeffrey Cheung <JCheung@...e.com>,
Charles Arndol <carnold@...e.com>,
Julien Grall <julien.grall@....com>,
Stefano Stabellini <stefano.stabellini@...citrix.com>,
joeyli <jlee@...e.com>, Borislav Petkov <bp@...en8.de>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Juergen Gross <jgross@...e.com>,
Andrew Cooper <andrew.cooper3@...rix.com>,
Michael Chang <MChang@...e.com>,
Andy Lutomirski <luto@...capital.net>,
David Vrabel <david.vrabel@...rix.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Roger Pau Monné <roger.pau@...rix.com>
Subject: Re: [Xen-devel] HVMLite / PVHv2 - using x86 EFI boot entry
On Thu, Apr 14, 2016 at 11:12:01PM +0200, Luis R. Rodriguez wrote:
> On Thu, Apr 14, 2016 at 04:38:47PM -0400, Konrad Rzeszutek Wilk wrote:
> > > This has nothing to do with dominance or anything nefarious, I'm asking
> > > simply for a full engineering evaluation of all possibilities, with
> > > the long term in mind. Not for now, but for hardware assumptions which
> > > are sensible 5 years from now.
> >
> > There are two different things in my mind about this conversation:
> >
> > 1). semantics of low-level code wrapped around pvops. On baremetal
> > it is easy - just look at Intel and AMD SDM.
> > And this is exactly what running in HVM or HVMLite mode will do -
> > all those low-level operations will have the same exact semantic
> > as baremetal.
>
> Today Linux is KVM stupid for early boot code. I've pointed this out
-EPARSE?
> before, but again, there has been no reason found to need this. Perhaps
> for HVMLite we won't need this...
Are you talking about kvmtool? Which, BTW, is similar to how HVMLite
would expose the platform.
>
> > There is no hope for the pv_ops to fix that.
>
> Actually I beg to differ. See my patches and ongoing work.
I meant in terms of semantics. As in, I cannot see some of
those pv-ops having the same semantics as baremetal. For example,
set_pte is simple on x86 (movq $<some value>, <memory address>),
while on Xen PV it is a potentially batched hypercall with a
lookup in a P2M table, then perhaps a sidelong look at
the M2P, then maybe the M2P override.
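
To make the contrast concrete, here is a rough sketch in C. It is a toy
model, not the kernel's or Xen's actual code: every toy_* name, the
table contents, and the batch size are made up for illustration, and the
M2P and M2P-override steps are left out entirely. The only point is that
the native path is one store, while the PV-style path translates the
frame number and defers the write until a batch is flushed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pte_t;

/* Baremetal / HVM semantics: set_pte is a single store to memory. */
void native_set_pte(pte_t *ptep, pte_t val)
{
    *ptep = val;
}

/* Everything below is a hypothetical toy model of a PV-style set_pte. */
#define TOY_NR_PFNS    8
#define TOY_BATCH      4
#define TOY_PAGE_SHIFT 12
#define TOY_FLAGS_MASK ((1ULL << TOY_PAGE_SHIFT) - 1)

/* Toy P2M table: guest pseudo-physical frame -> machine frame. */
uint64_t toy_p2m[TOY_NR_PFNS] = { 100, 101, 102, 103, 104, 105, 106, 107 };

struct toy_update {
    pte_t *ptep;
    pte_t  val;
};

struct toy_update toy_queue[TOY_BATCH];
size_t   toy_queued;
unsigned toy_hypercalls;   /* counts simulated hypercalls */

/* Stand-in for the batched hypercall: apply all queued updates at once. */
void toy_flush(void)
{
    if (toy_queued == 0)
        return;
    for (size_t i = 0; i < toy_queued; i++)
        *toy_queue[i].ptep = toy_queue[i].val;
    toy_queued = 0;
    toy_hypercalls++;
}

/* PV-style set_pte: translate PFN to MFN, then queue instead of storing. */
void toy_pv_set_pte(pte_t *ptep, pte_t val)
{
    uint64_t pfn = val >> TOY_PAGE_SHIFT;

    assert(pfn < TOY_NR_PFNS);

    toy_queue[toy_queued].ptep = ptep;
    toy_queue[toy_queued].val =
        (toy_p2m[pfn] << TOY_PAGE_SHIFT) | (val & TOY_FLAGS_MASK);

    /* A full queue forces a flush, so the cost of a call varies. */
    if (++toy_queued == TOY_BATCH)
        toy_flush();
}
```

So after native_set_pte() the entry is visible right away with the value
the caller passed in, while after toy_pv_set_pte() the entry is unchanged
until the flush, and the value that lands there carries a different
frame number than the one the caller passed in.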
>
> > And I am pretty sure the HVMLite in 5 years will have no
> > trouble in this as it will be running in VMX mode (HVM).
>
> HVMLite may still use PV drivers for some things, it's not super
> obvious to me that low level semantics will not be needed yet.
PV drivers are very different from low-level semantics.
And it will have to use them.
Maybe it is easier to think of this in terms of kvmtool - it
is pretty much how this would work - but instead of VirtIO
drivers you would be using the Xen PV drivers (though one
could also use VirtIO ones if one wanted).
>
> > 2). Boot entry.
> >
> > The semantics on Linux are well known - they are documented in
> > Documentation/x86/boot.txt.
> >
> > HVMLite Linux guests have to somehow provide that.
> >
> > And how it is done seems to be tied around:
> >
> > a) Use existing boot paths - which means making some
> > extra stub code to call in those existing boot paths
> > (for example Xen could bundle with a GRUB2-alike
> > code to be run when booting Linux using that boot-path).
> >
> > Or EFI (for a ton more code). Granted not all OSes
> > support those, so not very OS agnostic.
>
> What other OSes do is something to consider but if they don't
> do it because they are slacking in one domain should by no means
> be a reason to not evaluate the long term possible gains.
> Specially if we have reasons to believe more architectures will
> consider it and standardize on it.
>
> It'd be silly not to take this a bit more seriously.
Complexity vs simplicity.
>
> > Hard part - if the bootparams change then have to
> > rev up the code in there. May be out of sync
> > with Linux bootparams.
>
> If we are going to ultimately standardize on EFI boot for new
> hardware it'd be rather silly to extend the boot params further.
Whoa there... Have you spoken to hpa and tglx about this?
>
> > b) Add another simpler boot entry point which has to copy
> > "some" strings from its format in bootparams.
> >
> >
> > So this part of the discussion does not fall in the
> > hardware assumptions. Intel SDM or AMD mention nothing about
> > boot loaders or how to boot an OS - that is all in realms
> > of how software talks to software.
>
> Right -- so one question to ask here is what other uses are there
> for this outside of say HVMLite. You mentioned Multiboot so far.
>
> > 3). And there is the discussion on man-power to make this
> > happen.
>
> Sure.
>
> > 4). Lastly which one is simpler and involves less code so
> > that there is a less chance of bitrot.
>
> Indeed.
>
> You also forgot the tie-in between dead-code and semantics but
Wait, I just spoke about CPU semantics?! Which semantics
are you talking about?
> that clearly is not on your mind. But I'd say this is a good
> summary.
I put 'dead code' in the same realm as device driver work.
And drivers seem to always have some issue or another.
Or maybe I am just getting unlucky and getting copied on those bugs.
>
> Luis