Message-ID: <4A14314E.7030406@goop.org>
Date: Wed, 20 May 2009 09:35:26 -0700
From: Jeremy Fitzhardinge <jeremy@...p.org>
To: Ingo Molnar <mingo@...e.hu>
CC: Jan Beulich <JBeulich@...ell.com>,
the arch/x86 maintainers <x86@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Xen-devel <xen-devel@...ts.xensource.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Jesse Barnes <jbarnes@...tuousgeek.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
Ingo Molnar wrote:
> * Jan Beulich <JBeulich@...ell.com> wrote:
>
>
>>>>> Ingo Molnar <mingo@...e.hu> 19.05.09 11:59 >>>
>>>>>
>>> Exactly what is 'bizarre' about using the API defined by the
>>> _CPU_ already, without adding any ad-hoc hypercall? Catch the
>>> dom0 WRMSRs, filter out the MTRR indices - that's it.
>>>
>> But that is *not* the same as using the hypercalls: The hypercall
>> tells Xen "Change all CPUs' MTRRs with the indicated index to the
>> indicated value", while the MSR write says "Change the MTRR with
>> the given index on the physical CPU the current virtual CPU
>> happens to run on to the given value". [...]
>>
>
> The change of MTRRs on _any_ of the guest CPUs in a dom0 context
> should immediately be reflected on all CPUs. Asymmetric MTRR settings
> are madness.
>
> ( And the thing is, changing MTRRs is fragile and racy on native
> Linux no matter what - even without any hypervisors - due to SMM
> contexts possibly relying on them etc. )
>
>
>> [...] A write-base/write-mask pair may happen to get interrupted
>> (preempted) by the hypervisor, and hence the two writes may happen
>> on different pCPU-s. Teaching the hypervisor to (correctly!) guess
>> what the guest meant in that situation isn't trivial, as then it
>> needs to handle all possible situations (and it can never know
>> whether Dom0 really intended to do something that may look
>> bogus/inconsistent at the first glance). [...]
>>
>
> None of this is a problem really if a sane approach is used: a
> change to the MTRR state on dom0 is applied symmetrically on all
> CPUs.
>
> Or, alternatively, the hypervisor can expose its own administrative
> interface to manage MTRRs.
>
> There's no need to fuglify the Linux kernel for that.

I'm not sure what you mean by that, other than as a description of the
current case. The Xen MTRR hypercall:

   1. treats MTRR ranges as allocatable resources, and keeps track of how
      many uses there are of each
   2. updates all physical cpus synchronously (ie, the MTRR is not
      presented as a property of dom0's virtual CPU, but as a
      system-wide resource)
   3. prevents guests from setting inconsistent or conflicting MTRRs

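(Illustration only, not taken from the Xen source: a minimal C sketch of
the three properties listed above, assuming a small fixed table of ranges.
The names range_add, range_del and NR_RANGES are hypothetical, and the
overlap rule is deliberately stricter than real MTRR overlap semantics.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_RANGES 8            /* assumption: number of allocatable ranges */

struct range {
    uint64_t base, size;       /* in pages */
    unsigned type;             /* memory type of the range */
    unsigned refs;             /* number of users; 0 means the slot is free */
};

static struct range table[NR_RANGES];

static bool overlaps(const struct range *r, uint64_t base, uint64_t size)
{
    return r->refs && base < r->base + r->size && r->base < base + size;
}

/* Returns the slot index, or -1 on a conflict, a bad size or a full table. */
int range_add(uint64_t base, uint64_t size, unsigned type)
{
    int free_slot = -1;

    if (size == 0)
        return -1;

    for (int i = 0; i < NR_RANGES; i++) {
        struct range *r = &table[i];

        if (r->refs && r->base == base && r->size == size) {
            if (r->type != type)
                return -1;     /* same range, conflicting type */
            r->refs++;         /* one more user of an existing range */
            return i;
        }
        if (overlaps(r, base, size))
            return -1;         /* simplification: refuse any partial overlap */
        if (!r->refs && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;             /* no free range left */

    assert(table[free_slot].refs == 0);
    table[free_slot] = (struct range){ base, size, type, 1 };
    /* A hypervisor would program every physical CPU at this point. */
    return free_slot;
}

/* Drops one use; returns the remaining use count, or -1 if the slot is unused. */
int range_del(int slot)
{
    if (slot < 0 || slot >= NR_RANGES || !table[slot].refs)
        return -1;
    return (int)--table[slot].refs;
}
```

The caller never names a CPU: the range is a system-wide resource, and a
second request for the same range only bumps the use count.
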
Mapping MSR writes onto this interface is moderately complex, because
it means translating a low-semantic-content interface into a
high-semantic-content one. It essentially requires parsing the
MSR writes to recover the relatively high-level operations of
the mtrr_ops interface and then presenting those to Xen.

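(Illustration only: a rough C sketch of what such a parser has to do for
the variable-range MTRRs. The MSR numbers, the valid bit and the type
field follow the architectural layout; trap_wrmsr, struct mtrr_op, the
per-domain shadow array and the count of 8 registers are hypothetical.
It assumes the guest writes the base before the mask, which is exactly the
kind of guess discussed above.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_MTRR_PHYSBASE0 0x200u       /* base n at 0x200 + 2n, mask n at 0x201 + 2n */
#define MTRR_MASK_VALID    (1ull << 11) /* valid bit in the mask register */
#define NR_VAR_MTRRS       8u           /* assumption: variable-range register count */

/* High-level operation recovered from a pair of low-level MSR writes. */
struct mtrr_op {
    unsigned index;
    uint64_t base;                      /* physical base, low 12 bits cleared */
    uint64_t mask;                      /* physical mask, low 12 bits cleared */
    unsigned type;                      /* memory type from the base register */
    bool     enable;
};

/* Per-domain shadow state, so it does not matter which CPU each write ran on. */
struct shadow_reg {
    uint64_t base;
    bool     base_seen;
};

static struct shadow_reg shadow[NR_VAR_MTRRS];

/*
 * Returns  1 when *op holds a complete operation,
 *          0 when the write was stashed and more writes are needed,
 *         -1 when the MSR is not a variable-range MTRR,
 *         -2 when the sequence is ambiguous (enable without a base).
 */
int trap_wrmsr(uint32_t msr, uint64_t val, struct mtrr_op *op)
{
    assert(op);

    if (msr < MSR_MTRR_PHYSBASE0 ||
        msr >= MSR_MTRR_PHYSBASE0 + 2 * NR_VAR_MTRRS)
        return -1;

    unsigned off = msr - MSR_MTRR_PHYSBASE0;
    unsigned idx = off / 2;
    struct shadow_reg *s = &shadow[idx];

    if ((off & 1) == 0) {               /* base register: remember it, wait for mask */
        s->base = val;
        s->base_seen = true;
        return 0;
    }

    *op = (struct mtrr_op){ .index = idx };

    if (!(val & MTRR_MASK_VALID)) {     /* mask with valid bit clear: disable */
        s->base_seen = false;
        return 1;
    }
    if (!s->base_seen)                  /* enable with no base: we would have to guess */
        return -2;

    op->base = s->base & ~0xfffull;
    op->mask = val & ~0xfffull;
    op->type = (unsigned)(s->base & 0xff);
    op->enable = true;
    s->base_seen = false;
    return 1;
}
```

Even this toy version needs shadow state and an "ambiguous" return code,
and it does not yet touch the fixed-range MTRRs or the cr0 writes.
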
There are at least a couple of secondary issues which arise from that
approach:

    * mtrr/generic.c also has to do a number of other things like
      disabling caching, tlb flushes, etc. That adds complexity because
      Xen guests are never allowed to globally disable caching, so we'd
      have to add additional filtering to remove those cr0 writes.
    * As we've discussed, we'd need to make the mtrr writes implicitly
      change all cpus atomically, as the dom0 kernel can't see physical
      cpus.

The net effect would be that we would be making a pile of apparently
generic CPU operations (MSR writes, control register writes) actually
feed a fairly complex parser, increasing the difference between the Xen
and native cases even more.

mtrr/generic.c is about 730 lines of fairly intricate arch-specific code.
mtrr/xen.c is 120 lines of straightforward hypercalls. The mtrr_ops
interface and the Xen hypercall interface are a close semantic match, so
there's very little glue code in there.

But that said, this is a huge distraction, an unbelievable amount of noise
for a fairly minor point. We can live without these changes, and
they're certainly easy enough to carry out of tree in the meantime. If
you can't live with these changes, then drop them and we'll work out
something else.

J