linux-kernel - Re: [RESEND 2][PATCH 4/4] Modify KVM to update guest time accounting.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <47139959.7090900@qumranet.com>
Date:	Mon, 15 Oct 2007 18:46:17 +0200
From:	Avi Kivity <avi@...ranet.com>
To:	Laurent Vivier <Laurent.Vivier@...l.net>
CC:	Ingo Molnar <mingo@...e.hu>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [RESEND 2][PATCH 4/4] Modify KVM to update guest time	accounting.

Laurent Vivier wrote:
> Avi Kivity wrote:
>   
>> Laurent Vivier wrote:
>>
>>  
>>
>>     
>>>> But if we didn't get an interrupt in that time?
>>>>
>>>> We can clear it a bit later, after local_irq_enable() in
>>>> __vcpu_run(). However we need a nop instruction first because "sti"
>>>> keeps interrupts
>>>> disabled for one more instruction.
>>>>     
>>>>         
>>> IMHO, I think it is better to let kvm_guest_exit() empty (you can
>>> remove it, if
>>> you want):
>>>
>>> 1st case:
>>> - unset PF_VCPU in kvm_guest_exit(), all the tick is always for system
>>> time.
>>> Guest time is always 0.
>>>
>>> 1st case and half:
>>>
>>> - like 1st case but we move kvm_guest_exit() as you propose and the
>>> reason of
>>> the interrupt is the tick interrupt. The tick is for guest time only.
>>> I think
>>> the probability is very low.
>>>   
>>>       
>> If the guest is executing for 10% of the time, the probability is
>> exactly 10%, no?
>>     
>
> I think you know that better than me.
>
> But is there homogeneity in probability ?
>   

It's exactly the same issue as with systime and usertime.  The interrupt 
samples the program counter at various points at a fairly low frequency 
(milliseconds) while syscalls last a few dozens of microseconds.  
Probability makes it average out correctly in the end.

[Ingo, what about dyntick?  suppose you have just one process that calls 
read() from /dev/zero repeatedly.  There'd be very few (or no) 
interrupts -- what happens to accounting accuracy?]

> I mean, if the guest has a lot I/O, it is interrupted by them and the
> probability to be interrupted by a tick is lower than the time passed in the VCPU ?
>   

Suppose the time to service the I/O is exactly equal to the amount 
running in guest mode.   Then the probability of the interrupt happening 
in guest mode is equal to it happening outside guest mode and you'd get 
50% guest, 50% system/user, which is what you want.


>   
>>> 2nd case:
>>> - don't unset PF_VCPU in kvm_guest_exit(), all the tick is for guest
>>> time.
>>>   
>>>       
>> But then even execution in ->handle_exit() is accounted as guest time,
>> which is wrong.
>>     
>
> System time and User time are wrong too as the tick is accounted to the side
> where it appears, even if CPU has executed code from the other side in a
> sub-part of the tick. It's not a good argument.
>   

It's at least consistent... the same errors for everyone, so it averages 
out in the end.

-- 
error compiling committee.c: too many arguments to function

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/