Message-ID: <48DD8C4E.1030103@zytor.com>
Date:	Fri, 26 Sep 2008 18:28:46 -0700
From:	"H. Peter Anvin" <hpa@...or.com>
To:	Jeremy Fitzhardinge <jeremy@...p.org>
CC:	akataria@...are.com, Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	LKML <linux-kernel@...r.kernel.org>,
	the arch/x86 maintainers <x86@...nel.org>, avi@...hat.com,
	Rusty Russell <rusty@...tcorp.com.au>,
	Zachary Amsden <zach@...are.com>,
	Dan Hecht <dhecht@...are.com>, Jun.Nakajima@...el.Com,
	Tim Deegan <Tim.Deegan@...rix.com>
Subject: Re: Use CPUID to communicate with the hypervisor.
Jeremy Fitzhardinge wrote:
> 
> I'm sympathetic to the idea, but it seems a bit under-defined.
> 
> Are you leaving a gap between 0x40000000 and -10 for what?  Future
> extension?  Avoiding existing hypervisor-specific leaves?
> 
> I think there's a move towards doing a scan for a signature, such as
> checking every 16 leaves after 0x40000000 for "a while" looking for
> interesting signatures, so that a hypervisor can support multiple ABIs
> at once.  Given this, it would be better to define a "Generic Hypervisor
> ABI" signature, and put all the related leaves together.
> 
That's kind of iffy, although at least it does have a modicum of 
control.
There is already a de facto standard for doing this: on a (currently) 
64K boundary, add a leaf with a vendor ID and a limit; its presence is 
detectable by the limit in EAX having the same upper bits as the base leaf.
Then have each vendor pick a range that they maintain.  Intel uses 
0x0000xxxx (although they claim control of the entire numberspace), AMD 
uses 0x8000xxxx, VIA uses 0xC000xxxx, Transmeta used 0x8086xxxx, and 
0x4000xxxx is being reserved for "virtualization".  There are tools 
which use this as a way to try to dump all of CPUID without knowing details.
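
A minimal sketch of how such a generic dump tool might detect ranges, assuming the convention described above. The `cpuid` function and the `FAKE_CPU` table are hypothetical stand-ins for the real instruction, and the register values are invented; real hardware does not necessarily return zeros for undefined leaves.

```python
# Hypothetical stand-in for the CPUID instruction so the sketch runs anywhere.
# Maps leaf -> (EAX, EBX, ECX, EDX); all values are invented for illustration.
FAKE_CPU = {
    0x00000000: (0x0000000A, 0, 0, 0),
    0x40000000: (0x40000001, 0, 0, 0),
    0x80000000: (0x80000008, 0, 0, 0),
}


def cpuid(leaf):
    """Simplification: leaves missing from the table read as all zeros."""
    return FAKE_CPU.get(leaf, (0, 0, 0, 0))


def present_ranges(cpuid_fn, prefixes=(0x0000, 0x4000, 0x8000, 0x8086, 0xC000)):
    """Return {base_leaf: limit} for each 64K range whose base leaf checks out."""
    found = {}
    for prefix in prefixes:
        base = prefix << 16
        limit = cpuid_fn(base)[0]  # EAX of the base leaf holds the limit
        # Present only if the limit carries the same upper 16 bits as the base.
        if limit >> 16 == prefix:
            found[base] = limit
    return found


def dump(cpuid_fn):
    """Dump every leaf of every detected range without knowing its meaning."""
    return {
        leaf: cpuid_fn(leaf)
        for base, limit in present_ranges(cpuid_fn).items()
        for leaf in range(base, limit + 1)
    }
```

Calling `present_ranges(cpuid)` on the fake table reports the 0x0000, 0x4000 and 0x8000 ranges and skips the two that are absent. The check says nothing about what the leaves inside a range mean, which is the weakness of an unmanaged space.
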
See the problem here?  This is in effect an unmanaged space.  This means 
that without the vendor ID it is going to be meaningless, unless at 
least the major players in the virtualization industry could agree on 
how to use it, and that would still leave other users out in the cold.
Now, that would still require a vendor numberspace registry.  The 
obvious one is to use the numbers issued by PCI-SIG, which would require 
16 bits -- that would presumably mean numbers of the form 0x40SSSSxx 
with SSSS being the vendor ID; this would require scanning on a 256-leaf 
granularity for a generic tool.
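
The 0x40SSSSxx arithmetic can be sketched as follows. The helper names are hypothetical, and 0x8086 (Intel's PCI vendor ID) is used purely as a sample value, not as a claim about any existing leaf assignment.

```python
HYPERVISOR_SPACE = 0x40000000  # the 0x40xxxxxx block from the text


def vendor_base(pci_vendor_id):
    """First leaf of the 256-leaf window 0x40SSSSxx for a 16-bit vendor ID."""
    if not 0 <= pci_vendor_id <= 0xFFFF:
        raise ValueError("PCI vendor IDs are 16 bits wide")
    return HYPERVISOR_SPACE | (pci_vendor_id << 8)


def split_leaf(leaf):
    """Return (vendor_id, index) for a leaf of the form 0x40SSSSxx."""
    if leaf >> 24 != 0x40:
        raise ValueError("leaf is outside the 0x40xxxxxx block")
    return (leaf >> 8) & 0xFFFF, leaf & 0xFF


def scan_bases():
    """Every window base a generic tool would have to probe."""
    return range(HYPERVISOR_SPACE, HYPERVISOR_SPACE + (1 << 24), 1 << 8)
```

For example, `vendor_base(0x8086)` gives 0x40808600, and a tool with no vendor knowledge would have to probe all 65536 window bases returned by `scan_bases()`.
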
Overall, though, *any* generic solution requires buy-in from all 
significant players in the space, *AND* a way to distinguish 
noncompliant implementations.  Designing a functional solution is the 
easy part of that[*].  Getting sufficient buy-in is the hard part.
> And then, rather than having a simple "maximum leaf", it would be better
> to have cap bits for each specific feature.  For example, how would the
> "RESERVED" registers in "Timing information" ever get used?  How would
> you know that they were no longer reserved, but now meaningful?
Typically you'd define them to be zero unless usable, and define them so 
that a meaningful value would be nonzero.
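
A rough sketch of the zero-means-unused convention from the guest's side. The field names are hypothetical and not taken from any specification; the registers are passed in the order (EAX, EBX, ECX, EDX).

```python
# Hypothetical layout of a timing leaf; the names are illustrative only.
TIMING_FIELDS = ("tsc_khz", "bus_khz", "reserved_ecx", "reserved_edx")


def decode_timing_leaf(regs, names=TIMING_FIELDS):
    """Keep only the fields the hypervisor actually filled in (nonzero)."""
    return {name: value for name, value in zip(names, regs) if value != 0}
```

An older hypervisor returning `(2400000, 66000, 0, 0)` yields only the first two fields, while a newer one that assigns meaning to ECX simply returns a nonzero value there. The scheme only works for fields where zero is never a legitimate value.
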
> That said, I'm a bit worried about the whole idea of having these kinds
> of timing parameters.  It does assume that they're constant for the
> whole life of the VM.  What if they change due to power management or
> migration?
Presumably you'd have to have some way to notify the VM, via an 
interrupt of some sort.
	-hpa
[*] Consider the following totally half-baked example:
CPUID leaf 0x40000000
	ECX-EDX-EBX	Vendor name
	EAX		Max CPUID level supported
	Motivation: existing practice
CPUID leaf 0x40000001...
	EAX		leaf number	Pointer
	ECX		DID:VID		PCI-style
	EDX		0xcc06ab0b	Magic number
	EBX		0x7ab3857a	Magic number
	This would use the PCI vendor ID and an arbitrary "device ID"
	to point to a leaf number, which would then contain information
	starting with an identification/count leaf.  The DID:VID would
	signal who defined the specification, not necessarily who wrote
	the hypervisor.  This is similar to how Intel uses AMD-defined
	CPUID levels, for example.
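
One way a guest might walk the layout above, as a sketch only. The `FAKE_HV` table, the "ExampleHV" name, and the DID:VID value are invented; entries without both magic numbers are skipped, which is one possible way to ignore noncompliant leaves. The register order is (EAX, EBX, ECX, EDX), and DID:VID is assumed to follow PCI config space, with the vendor ID in the low 16 bits.

```python
MAGIC_EDX = 0xCC06AB0B
MAGIC_EBX = 0x7AB3857A


def _regs_for_name(name):
    """Pack up to 12 ASCII characters into (EBX, ECX, EDX), ECX-EDX-EBX order."""
    raw = name.encode("ascii").ljust(12, b"\0")
    ecx, edx, ebx = (int.from_bytes(raw[i:i + 4], "little") for i in (0, 4, 8))
    return ebx, ecx, edx


# Hypothetical hypervisor: leaf -> (EAX, EBX, ECX, EDX), values invented.
FAKE_HV = {
    0x40000000: (0x40000002,) + _regs_for_name("ExampleHV"),
    0x40000001: (0x40000100, MAGIC_EBX, (0x0001 << 16) | 0x8086, MAGIC_EDX),
    0x40000002: (0x12345678, 0, 0, 0),  # no magic numbers: not an entry
}


def cpuid(leaf):
    """Simplification: leaves missing from the table read as all zeros."""
    return FAKE_HV.get(leaf, (0, 0, 0, 0))


def vendor_name(regs):
    """Decode the vendor name stored in ECX-EDX-EBX of the base leaf."""
    _eax, ebx, ecx, edx = regs
    raw = b"".join(r.to_bytes(4, "little") for r in (ecx, edx, ebx))
    return raw.rstrip(b"\0").decode("ascii", errors="replace")


def directory(cpuid_fn, base=0x40000000):
    """Return {(vendor_id, device_id): pointed_to_leaf} for valid entries."""
    max_leaf = cpuid_fn(base)[0]
    if max_leaf >> 16 != base >> 16:
        return {}  # limit does not belong to this range: nothing to walk
    entries = {}
    for leaf in range(base + 1, max_leaf + 1):
        eax, ebx, ecx, edx = cpuid_fn(leaf)
        if (edx, ebx) != (MAGIC_EDX, MAGIC_EBX):
            continue  # skip leaves that do not carry both magic numbers
        entries[(ecx & 0xFFFF, ecx >> 16)] = eax
    return entries
```

A guest would look up the (VID, DID) pairs it understands in the result of `directory(cpuid)` and then read the identification/count leaf that the entry points to.
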
	-hpa