linux-kernel - Re: [PATCH] Enhance perf to collect KVM guest os statistics from host side

Message-ID: <4BA00E6A.7080903@linux.vnet.ibm.com>
Date:	Tue, 16 Mar 2010 18:04:10 -0500
From:	Anthony Liguori <aliguori@...ux.vnet.ibm.com>
To:	Ingo Molnar <mingo@...e.hu>
CC:	"Frank Ch. Eigler" <fche@...hat.com>, Avi Kivity <avi@...hat.com>,
	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Sheng Yang <sheng@...ux.intel.com>,
	linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Joerg Roedel <joro@...tes.org>,
	Jes Sorensen <Jes.Sorensen@...hat.com>,
	Gleb Natapov <gleb@...hat.com>,
	Zachary Amsden <zamsden@...hat.com>, ziteng.huang@...el.com
Subject: Re: [PATCH] Enhance perf to collect KVM guest os statistics from
 host side

On 03/16/2010 01:28 PM, Ingo Molnar wrote:
> * Anthony Liguori<aliguori@...ux.vnet.ibm.com>  wrote:
>
>    
>> On 03/16/2010 12:52 PM, Ingo Molnar wrote:
>>      
>>> * Anthony Liguori<aliguori@...ux.vnet.ibm.com>   wrote:
>>>
>>>        
>>>> On 03/16/2010 10:52 AM, Ingo Molnar wrote:
>>>>          
>>>>> You are quite mistaken: KVM isnt really a 'random unprivileged application' in
>>>>> this context, it is clearly an extension of system/kernel services.
>>>>>
>>>>> ( Which can be seen from the simple fact that what started the discussion was
>>>>>    'how do we get /proc/kallsyms from the guest'. I.e. an extension of the
>>>>>    existing host-space /proc/kallsyms was desired. )
>>>>>            
>>>> Random tools (like perf) should not be able to do what you describe. It's a
>>>> security nightmare.
>>>>          
>>> A security nightmare exactly how? Mind to go into details as i dont understand
>>> your point.
>>>        
>> Assume you're using SELinux to implement mandatory access control.
>> How do you label this file system?
>>
>> Generally speaking, we don't know the difference between /proc/kallsyms vs.
>> /dev/mem if we do generic passthrough.  While it might be safe to have a
>> relaxed label of kallsyms (since it's read only), it's clearly not safe to
>> do that for /dev/mem, /etc/shadow, or any file containing sensitive
>> information.
>>      
> What's your _point_? Please outline a threat model, a vector of attack,
> _anything_ that substantiates your "it's a security nightmare" claim.
>    

You suggested "to have a (read only) mount of all guest filesystems".

As I described earlier, not all of the information within the guest 
filesystem has the same level of sensitivity.  If you expose a generic 
interface like this, it becomes very difficult to delegate privileges.

Delegating privileges is important because in a higher-security 
environment, you may want to prevent a management tool from accessing 
the VM's disk directly, but still allow it to do basic operations (in 
particular, to view performance statistics).
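
To make the delegation point concrete, here is a minimal sketch in Python. 
The item names and privilege labels are made up for illustration; this is 
not an existing qemu, libvirt, or SELinux interface.  The idea is only that 
each exposed item carries its own required privilege, which a generic 
passthrough of the whole filesystem cannot express.

```python
# Hypothetical guest-data items and privilege labels, for illustration only.
POLICY = {
    "kallsyms": "perf_reader",    # read-only symbol table, low sensitivity
    "perf_stats": "perf_reader",  # performance statistics
    "disk_image": "disk_admin",   # full guest contents, high sensitivity
}


def may_access(privileges, item):
    """Return True if the caller holds the privilege required for item."""
    required = POLICY.get(item)
    # Unknown items are denied: nothing is exposed by default.
    return required is not None and required in privileges
```

A management tool holding only "perf_reader" could then read statistics 
without ever being able to touch the disk.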

>> Rather, we ought to expose a higher level interface that we have more
>> confidence in with respect to understanding the ramifications of exposing
>> that guest data.
>>      
> Exactly, we want something that has a flexible namespace and works well with
> Linux tools in general. Preferably that namespace should be human readable,
> and it should be hierarchic, and it should have a well-known permission model.
>
> This concept exists in Linux and is generally called a 'filesystem'.
>    

If you want to use a synthetic filesystem as the management interface 
for qemu, that's one thing.  But you suggested exposing the guest 
filesystem in its entirety and that's what I disagreed with.

> If a user cannot read the image file then the user has no access to its
> contents via other namespaces either. That is, of course, a basic security
> aspect.
>
> ( That is perfectly true with a non-SELinux Unix permission model as well, and
>    is true in the SELinux case as well. )
>    

I don't think that's reasonable at all.  The guest may encrypt its disk 
image.  It still ought to be possible to run perf against that guest, no?

> Erm. Please explain to me, what exactly is 'not that simple' in a MAC
> environment?
>
> Also, i'd like to note that the 'restrictive SELinux setups' usecases are
> pretty secondary.
>
> To demonstrate that, i'd like every KVM developer on this list who reads this
> mail and who has their home development system where they produce their
> patches set up in a restrictive MAC environment, in that you cannot even read
> the images you are using, to chime in with a "I'm doing that" reply.
>    

My home system doesn't run SELinux but I work daily with systems that 
are using SELinux.

I want to be able to run tools like perf on these systems because 
ultimately, I need to debug these systems on a daily basis.

But that's missing the point.  We want to have an interface that works 
for both cases so that we're not maintaining two separate interfaces.

We've rat-holed a bit, though.  You want:

1) to run perf kvm list and be able to enumerate KVM guests

2) for this to Just Work with qemu guests launched from the command line

You could achieve (1) by tying perf to libvirt but that won't work for 
(2).  There are a few practical problems with (2).

qemu does not require the user to associate any uniquely identifying 
information with a VM.  We've also optimized the command line use case 
so that if all you want to do is run a disk image, you just execute 
"qemu foo.img".  To satisfy your use case, we would either have to force 
a user to always specify unique information, which would be less 
convenient for our users, or we would have to let the name be an optional 
parameter.

As it turns out, we already support "qemu -name Fedora foo.img".  What 
we don't do today, but I've been suggesting we should, is automatically 
create a QMP management socket in a well known location based on the 
-name parameter when it's specified.  That would let a tool like perf 
Just Work provided that a user specified -name.
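
As a rough sketch of what that could look like from the tool's side 
(Python; the directory and the ".qmp" suffix are assumptions of mine, qemu 
does not define such a location today): map the -name value to a socket 
path, and enumerate guests by scanning the directory.

```python
import os
import re

# Hypothetical well-known directory; qemu does not define one today.
SOCKET_DIR = "/var/run/qemu"


def qmp_socket_path(name, base=SOCKET_DIR):
    """Map a guest -name value to a socket path under base."""
    # Reject names that could escape the directory.
    if not re.fullmatch(r"[A-Za-z0-9_.-]+", name) or name in (".", ".."):
        raise ValueError("unsuitable guest name: %r" % name)
    return os.path.join(base, name + ".qmp")


def list_guests(base=SOCKET_DIR):
    """Enumerate guest names by scanning base for *.qmp entries."""
    try:
        entries = os.listdir(base)
    except FileNotFoundError:
        return []  # no directory means no named guests
    return sorted(e[:-len(".qmp")] for e in entries if e.endswith(".qmp"))
```

Guests started without -name would simply not show up in such a listing, 
which is the limitation described next.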

No one uses -name today though and I'm sure you don't either.

The only way to really address this is to change the interaction.  
Instead of running perf externally to qemu, we should support a perf 
command in the qemu monitor that can then tie directly to the perf 
tooling.  That gives us the best possible user experience.

We can't do that though unless perf is a library or is in some way more 
programmatic.

Regards,

Anthony Liguori
