linux-kernel - Re: [PATCH v2] KVM: PPC: Defer vtime accounting 'til after IRQ handling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <d7f59d0e-eac2-7978-4067-9258c8b1aefe@redhat.com>
Date:   Wed, 13 Oct 2021 11:30:00 +0200
From:   Laurent Vivier <lvivier@...hat.com>
To:     Michael Ellerman <mpe@...erman.id.au>, kvm-ppc@...r.kernel.org
Cc:     Paul Mackerras <paulus@...abs.org>, Greg Kurz <groug@...d.org>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        linux-kernel@...r.kernel.org, Nicholas Piggin <npiggin@...il.com>,
        linuxppc-dev@...ts.ozlabs.org, stable@...r.kernel.org
Subject: Re: [PATCH v2] KVM: PPC: Defer vtime accounting 'til after IRQ
 handling

On 13/10/2021 01:18, Michael Ellerman wrote:
> Laurent Vivier <lvivier@...hat.com> writes:
>> Commit 112665286d08 moved guest_exit() in the interrupt protected
>> area to avoid wrong context warning (or worse), but the tick counter
>> cannot be updated and the guest time is accounted to the system time.
>>
>> To fix the problem port to POWER the x86 fix
>> 160457140187 ("Defer vtime accounting 'til after IRQ handling"):
>>
>> "Defer the call to account guest time until after servicing any IRQ(s)
>>   that happened in the guest or immediately after VM-Exit.  Tick-based
>>   accounting of vCPU time relies on PF_VCPU being set when the tick IRQ
>>   handler runs, and IRQs are blocked throughout the main sequence of
>>   vcpu_enter_guest(), including the call into vendor code to actually
>>   enter and exit the guest."
>>
>> Fixes: 112665286d08 ("KVM: PPC: Book3S HV: Context tracking exit guest context before enabling irqs")
>> Cc: npiggin@...il.com
>> Cc: <stable@...r.kernel.org> # 5.12
>> Signed-off-by: Laurent Vivier <lvivier@...hat.com>
>> ---
>>
>> Notes:
>>      v2: remove reference to commit 61bd0f66ff92
>>          cc stable 5.12
>>          add the same comment in the code as for x86
>>
>>   arch/powerpc/kvm/book3s_hv.c | 24 ++++++++++++++++++++----
>>   1 file changed, 20 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 2acb1c96cfaf..a694d1a8f6ce 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
> ...
>> @@ -4506,13 +4514,21 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>>   
>>   	srcu_read_unlock(&kvm->srcu, srcu_idx);
>>   
>> +	context_tracking_guest_exit();
>> +
>>   	set_irq_happened(trap);
>>   
>>   	kvmppc_set_host_core(pcpu);
>>   
>> -	guest_exit_irqoff();
>> -
>>   	local_irq_enable();
>> +	/*
>> +	 * Wait until after servicing IRQs to account guest time so that any
>> +	 * ticks that occurred while running the guest are properly accounted
>> +	 * to the guest.  Waiting until IRQs are enabled degrades the accuracy
>> +	 * of accounting via context tracking, but the loss of accuracy is
>> +	 * acceptable for all known use cases.
>> +	 */
>> +	vtime_account_guest_exit();
> 
> This pops a warning for me, running guest(s) on Power8:
>   
>    [  270.745303][T16661] ------------[ cut here ]------------
>    [  270.745374][T16661] WARNING: CPU: 72 PID: 16661 at arch/powerpc/kernel/time.c:311 vtime_account_kernel+0xe0/0xf0

Thank you, I missed that...

My patch is wrong, I have to add vtime_account_guest_exit() before the local_irq_enable().

arch/powerpc/kernel/time.c

  305 static unsigned long vtime_delta(struct cpu_accounting_data *acct,
  306                                  unsigned long *stime_scaled,
  307                                  unsigned long *steal_time)
  308 {
  309         unsigned long now, stime;
  310
  311         WARN_ON_ONCE(!irqs_disabled());
...

But I don't understand how ticks can be accounted now if irqs are still disabled.

Not sure it is as simple as expected...

Thanks,
Laurent