[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160217223520.GJ25240@wotan.suse.de>
Date: Wed, 17 Feb 2016 23:35:20 +0100
From: "Luis R. Rodriguez" <mcgrof@...e.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@...cle.com>,
"Luis R. Rodriguez" <mcgrof@...nel.org>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Andrew Cooper <andrew.cooper3@...rix.com>,
David Vrabel <david.vrabel@...rix.com>,
Andy Lutomirski <luto@...capital.net>,
Juergen Gross <jgross@...e.com>,
Michael Brown <mcb30@...e.org>,
Thomas Gleixner <tglx@...utronix.de>,
Andrey Ryabinin <aryabinin@...tuozzo.com>,
Joerg Roedel <joro@...tes.org>,
Robert Moore <robert.moore@...el.com>,
Mauro Carvalho Chehab <mchehab@....samsung.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
Xen Devel <xen-devel@...ts.xensource.com>,
"H. Peter Anvin" <hpa@...or.com>,
Rusty Russell <rusty@...tcorp.com.au>,
Jan Beulich <JBeulich@...e.com>, Lv Zheng <lv.zheng@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
long.wanglong@...wei.com, Fengguang Wu <fengguang.wu@...el.com>,
qiuxishi@...wei.com, Andrey Ryabinin <ryabinin.a.a@...il.com>,
david.e.box@...el.com, X86 ML <x86@...nel.org>,
Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH v2 3/3] paravirt: rename paravirt_enabled to
paravirt_legacy
On Wed, Feb 17, 2016 at 11:03:04PM +0100, Borislav Petkov wrote:
> On Wed, Feb 17, 2016 at 04:21:56PM -0500, Boris Ostrovsky wrote:
> > That's exactly the point: if something is mapped it's an error for a
> > non-PV kernel.
>
> How would something be mapped there? __PAGE_OFFSET is
> 0xffff880000000000.
>
> Or are you thinking about some insanely b0rked kernel code mapping stuff
> in there?
>
> > By removing paravirt_enabled() we may miss those errors. Worse, I think we
> > may even crash while doing pagetable walk (although it's probably better to
> > crash here than to use an unexpected translation in real code somewhere)
>
> Well, if this is the only site which keeps paravirt_enabled() from being
> buried, we need to think about a better way to detect a hypervisor.
> Maybe we should look at x86_hyper, use CPUID(0x4...) or something else.
>
> What's your preference?
I can think of two possibilities but lets also address _why_ we can't
replace it: the semantic gap.
1) Can't we just set_memory_np() to cause a page fault on that range
to catch invalid access?
2) Add hypervisor type
I've been lobbying for a new boot protocol hypervisor type and hypervisor data
pointer extensions, much in the same way subarch and subarch_data was added for
x86 boot protocol 2.07. This would be an extension to help fill in the gaps,
it would also make it accessible to super early code and could help those of
us who care about super clean inits to work towards these goals. Right now we
can rely on the subarch for most Xen concerns but with Xen HVM and future Xen
HVMLite things get more complex and as noted by hpa the subarch was not
designed as a 'hypervisor type', such a thing should be considered separateley.
So we'd knock about 2-3 birds with 1 stone here:
a) there is a semantic gap between early access to hypervisor type of
code between asm boot and when setup_arch() is called, only after
init_hypervisor_platform() is called (in setup_arch()) can things
such as cpu_has_hypervisor() or derivatives such as kvm_para_available()
be correctly used, as only then will that information be correct across
the board. Part of this issue is what gave rise to paravirt_enabled()
hackeries in the first place. We have subarch but that is not to be used
to set things such as hypervisor type, I'll soon post a patch to clarify
the exact use case for the subarch.
Since we have no uniform way to detect hypervisor types we are starting
to see custom hacks. For instance of a hack propagated to drivers now, see
sound/pci/intel8x0.c use of kvm_para_available().
b) help put an end to paravirt_enabled() for cases we can't replace
c) provide an early access mechanism to hypervisor type. This should
help towards unifying inits by enabling further stubs on early and
post routine calls. This is future long term possible work:
With a hypervisor type and hypervisor custom data pointer, we'd strive
to work to make xen use the same startup_32() and startup_64() entries,
with stubs possible using the hypervisor type at the start /end as
follows:
startup_32() startup_64()
| |
| |
V V
pre_hypervisor_stub_32() pre_hypervisor_stub_64()
| |
| |
V V
[existing startup_32()] [existing startup_64()]
| |
| |
V V
post_hypervisor_stub_32() post_hypervisor_stub_64()
The pre_hypervisor_stub_32() would have much of the code of
the newly proposed hvmlite_start_xen() but for 32-bit, pre_hypervisor_stub_64()
would have the 64-bits.
I realize Andrew has not been a fan over the idea of Xen setting on the
zero page *any* parameter but he's also noted an alternative is to just
lobby for Grub to boot Xen kernels and then we can rely on Grub to
set it for Linux. The only issue with this is it seems this doesn't
address stubdomains which don't use grub?
Luis
Powered by blists - more mailing lists