linux-kernel - Re: [RFC] Unify KVM kernel-space and user-space code into a single project


Message-ID: <4BAA2BF7.4060407@redhat.com>
Date:	Wed, 24 Mar 2010 17:12:55 +0200
From:	Avi Kivity <avi@...hat.com>
To:	Joerg Roedel <joro@...tes.org>
CC:	Anthony Liguori <anthony@...emonkey.ws>,
	Ingo Molnar <mingo@...e.hu>,
	Pekka Enberg <penberg@...helsinki.fi>,
	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Sheng Yang <sheng@...ux.intel.com>,
	linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Jes Sorensen <Jes.Sorensen@...hat.com>,
	Gleb Natapov <gleb@...hat.com>, ziteng.huang@...el.com,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Frédéric Weisbecker <fweisbec@...il.com>,
	Gregory Haskins <ghaskins@...ell.com>
Subject: Re: [RFC] Unify KVM kernel-space and user-space code into a single
 project

On 03/24/2010 05:01 PM, Joerg Roedel wrote:
>
>> But when I weigh the benefit of truly transparent system-wide perf
>> integration for users who don't use libvirt but do use perf, versus
>> the cost of transforming kvm from a single-process API to a
>> system-wide API with all the complications that I've listed, it comes
>> out in favour of not adding the API.
>>      
> It's not a transformation, it's an extension. The current per-process
> /dev/kvm stays mostly untouched. It's all about having something like
> this:
>
> $ cd /sys/kvm/guest0
> $ ls -l
> -r-------- 1 root root 0 2009-08-17 12:05 name
> dr-x------ 1 root root 0 2009-08-17 12:05 fs
> $ cat name
> guest0
> $ # ...
>
> The fs/ directory is used as the mount point for the guest root fs.
>    

The problem is /sys/kvm, not /sys/kvm/fs.

>>> The samples will be tagged with the guest-name (and some additional
>>> information perf needs). Perf userspace can access the symbols then
>>> through /sys/kvm/guest0/fs/...
>>>        
>> I take that as a yes?  So we need a virtio-serial client in the kernel
>> (which might be exploitable by a malicious guest if buggy) and a
>> fs-over-virtio-serial client in the kernel (also exploitable).
>>      
> What I meant was: perf-kernel puts the guest-name into every sample and
> perf-userspace accesses /sys/kvm/guest_name/fs/ later to resolve the
> symbols. I leave the question of how the guest-fs is exposed to the host
> out of this discussion. We should discuss this separately.
>    

How I see it: perf-kernel puts the guest pid into every sample, and 
perf-userspace uses that to resolve to a mountpoint served by fuse, or 
to a unix domain socket that serves the files.
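
A rough sketch of the userspace half, for the fuse mountpoint variant. The
directory layout (GUEST_FS_ROOT, one directory per qemu pid) and both function
names are made up for illustration; nothing like this exists today, and the
unix domain socket variant would look different.

```python
import posixpath

# Hypothetical layout: <root>/<qemu pid>/fs is where the guest's root
# filesystem is exported (e.g. by a fuse server). Assumed, not an
# existing convention.
GUEST_FS_ROOT = "/var/run/kvm-guestfs"


def guest_mountpoint(guest_pid, root=GUEST_FS_ROOT):
    """Return the host directory where the guest's files are served."""
    return posixpath.join(root, str(guest_pid), "fs")


def resolve_guest_path(guest_pid, guest_path, root=GUEST_FS_ROOT):
    """Map a path as seen inside the guest to a path on the host.

    The guest path is normalised first so that ".." components cannot
    climb out of the guest's mountpoint.
    """
    mount = guest_mountpoint(guest_pid, root)
    # Anchor at "/" before normalising, then make the result relative.
    relative = posixpath.normpath("/" + guest_path).lstrip("/")
    return posixpath.join(mount, relative)


if __name__ == "__main__":
    # A sample tagged with pid 1234 that hit libc inside the guest.
    print(resolve_guest_path(1234, "/usr/lib/libc.so.6"))
```

perf-userspace would then open the returned path to read symbols, with
ordinary file permissions on the mountpoint deciding who may do so.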

>>> An approach like: "The files are owned and only readable by the same
>>> user that started the vm." might be a good start. So a user can measure
>>> its own guests and root can measure all of them.
>>>        
>> That's not how sVirt works.  sVirt isolates a user's VMs from each
>> other, so if a guest breaks into qemu it can't break into other guests
>> owned by the same user.
>>      
> If a vm breaks into qemu it can access the host file system which is the
> bigger problem. In this case there is no isolation anymore. From that
> context it can even kill other VMs of the same user independent of a
> hypothetical /sys/kvm/.
>    

It cannot.  sVirt labels the disk image and other files qemu needs with 
the appropriate label, and everything else is off limits.  Even if you 
run the guest as root, it won't have access to other files.
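
To make the labelling argument concrete, here is a toy model of the check.
It is a simplification under stated assumptions, not how SELinux is
implemented: it assumes labels of the form user:role:type:level with a
comma-separated category list, hard-codes the svirt_t / svirt_image_t type
pair, and ignores category ranges and the rest of the policy. The function
names and the example categories are invented for illustration.

```python
def parse_label(label):
    """Split 'user:role:type:s0:c1,c2' into (type, set of categories)."""
    _user, _role, setype, level = label.split(":", 3)
    _sensitivity, _, cats = level.partition(":")
    categories = frozenset(cats.split(",")) if cats else frozenset()
    return setype, categories


def may_access(process_label, file_label):
    """Toy check: the type pair must match and the process's categories
    must cover the file's categories. The uid is deliberately not
    consulted, so running as root changes nothing in this model."""
    ptype, pcats = parse_label(process_label)
    ftype, fcats = parse_label(file_label)
    if ptype != "svirt_t" or ftype != "svirt_image_t":
        return False
    return fcats <= pcats


if __name__ == "__main__":
    guest_a = "system_u:system_r:svirt_t:s0:c10,c20"
    image_a = "system_u:object_r:svirt_image_t:s0:c10,c20"
    image_b = "system_u:object_r:svirt_image_t:s0:c30,c40"
    print(may_access(guest_a, image_a))  # guest A's own disk image
    print(may_access(guest_a, image_b))  # another guest's disk image
```

In this model a compromised qemu for guest A can reach its own image but
neither guest B's image nor an unrelated host file, whoever owns them.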

>>> Yeah that would be interesting information. But it is more related to
>>> tracing than to pmu measurements. The information which you
>>> mentioned above is probably better captured by an extension of
>>> trace-events to userspace.
>>>        
>> It's all related.  You start with perf, see a problem with mmio, call up
>> a histogram of mmio or interrupts or whatever, then zoom in on the
>> misbehaving device.
>>      
> Yes, but it's different from the implementation point-of-view. For the
> user it surely all plays together.
>    

We need qemu to cooperate for mmio tracing, and we can cooperate with 
qemu for symbol resolution.  If it prevents adding another kernel API, 
that's a win from my POV.

-- 
error compiling committee.c: too many arguments to function
