Message-ID: <4BA7AF2D.7060306@redhat.com>
Date: Mon, 22 Mar 2010 19:55:57 +0200
From: Avi Kivity <avi@...hat.com>
To: Ingo Molnar <mingo@...e.hu>
CC: Anthony Liguori <anthony@...emonkey.ws>,
Pekka Enberg <penberg@...helsinki.fi>,
"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Sheng Yang <sheng@...ux.intel.com>,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
Marcelo Tosatti <mtosatti@...hat.com>,
Joerg Roedel <joro@...tes.org>,
Jes Sorensen <Jes.Sorensen@...hat.com>,
Gleb Natapov <gleb@...hat.com>,
Zachary Amsden <zamsden@...hat.com>, ziteng.huang@...el.com,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Frédéric Weisbecker <fweisbec@...il.com>,
Gregory Haskins <ghaskins@...ell.com>
Subject: Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/22/2010 07:34 PM, Ingo Molnar wrote:
>
>> The 'something trustable and kernel-provided'. The kernel knows nothing
>> about guest names.
>>
> The kernel certainly knows about other resources such as task names or network
> interface names or tracepoint names. This is kernel design 101.
>
But it doesn't know about guest names. You can't trust task names since
any user can create a task with any name. Network interfaces are root
only so you can trust their names.
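To illustrate the point about task names: on Linux, an unprivileged process can rename itself by writing to /proc/self/comm (prctl(PR_SET_NAME) does the same job). The following is a minimal Python sketch, assuming a Linux host with /proc mounted. The helper name set_task_name and the example name are made up for illustration.

```python
def set_task_name(name: str) -> str:
    """Rename the calling task and return the name the kernel now reports.

    Assumes Linux with /proc mounted. No privileges are required, which is
    why a task name cannot be treated as a trusted identifier.
    """
    with open("/proc/self/comm", "w") as f:
        f.write(name)
    with open("/proc/self/comm") as f:
        # The kernel keeps at most 15 bytes of the name (TASK_COMM_LEN - 1).
        return f.read().rstrip("\n")


if __name__ == "__main__":
    # Any user can pick a name that looks like someone else's guest.
    print(set_task_name("qemu-guest-42"))
```

Tools that display the comm field (top, ps -o comm) would then show the process under the chosen name, so the name alone says nothing about who started the task or what it really is.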
There are dozens or even hundreds of object classes the kernel does not
know about and cannot enumerate. User names, for instance. X sessions.
Windows (the screen artifact, not the OS). CIFS shares exported by this
machine. Currently running applications (not processes).
btw, IMO network interfaces would have been much better off using
/dev/netif/name, like disks, rather than having their own namespace.
>>>> [...] I don't like using the term, because sometimes the layers are
>>>> incorrect and need to be violated. But it should be done explicitly, not
>>>> as a shortcut for a minor feature (and profiling is a minor feature, most
>>>> users will never use it, especially guest-from-host).
>>>>
>>>> The fact is we have well defined layers today, kvm virtualizes the cpu
>>>> and memory, qemu emulates devices for a single guest, libvirt manages
>>>> guests. We break this sometimes but there has to be a good reason. So
>>>> perf needs to talk to libvirt if it wants names. Could be done via
>>>> linking, or can be done using a plugin libvirt drops into perf.
>>>>
> This is really just the much-discredited microkernel approach for keeping
> global enumeration data that should be kept by the kernel ...
>
I disagree that it should be kept in the kernel. Why introduce a new
namespace, with APIs to query and manage it and rules for resolving
conflicts, and then have to virtualize it for containers?
> Lets look at the ${HOME}/.qemu/qmp/ enumeration method suggested by Anthony.
> There's numerous ways that this can break:
>
I don't like it either. We have libvirt for enumerating guests.
> - Those special files can get corrupted, mis-setup, get out of sync, or can
> be hard to discover.
>
> - The ${HOME}/.qemu/qmp/ solution suggested by Anthony has a very obvious
> design flaw: it is per user. When i'm root i'd like to query _all_ current
> guest images, not just the ones started by root. A system might not even
> have a notion of '${HOME}'.
>
> - Apps might start KVM vcpu instances without adhering to the
> ${HOME}/.qemu/qmp/ access method.
>
- It doesn't work with NFS.
> - There is no guarantee for the Qemu process to reply to a request - while
> the kernel can always guarantee an enumeration result. I dont want 'perf
> kvm' to hang or misbehave just because Qemu has hung.
>
If qemu doesn't reply, your guest is dead anyway.
> Really, for such reasons user-space is pretty poor at doing system-wide
> enumeration and resource management. Microkernels lost for a reason.
>
Take a look at your desktop: userspace is doing all of that everywhere,
from enumerating users and groups to deciding how your disks are
named. The kernel only provides the bare facilities.
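As one illustration of userspace doing the enumeration: the kernel only deals in numeric UIDs, while the mapping from names to UIDs typically comes from the C library's passwd database lookups (local files, LDAP and so on, depending on configuration). Here is a small Python sketch using the standard pwd module, assuming a Unix-like system. list_users is a hypothetical helper name.

```python
import pwd


def list_users() -> list[tuple[str, int]]:
    """Return (name, uid) pairs for every entry in the passwd database.

    The lookup is performed by userspace (the C library behind the pwd
    module); no kernel interface is asked for user names.
    """
    return [(entry.pw_name, entry.pw_uid) for entry in pwd.getpwall()]


if __name__ == "__main__":
    for name, uid in list_users():
        print(f"{uid:>6} {name}")
```

Which users appear depends entirely on how the passwd database is configured on the machine, which is the sense in which the kernel provides only the bare facility (the UID) and userspace owns the namespace.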
> You are committing several grave design mistakes here.
>
I am committing on the shoulders of giants.
--
error compiling committee.c: too many arguments to function