lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 14 Jan 2022 13:29:46 +0100
From:   Christian Borntraeger <borntraeger@...ux.ibm.com>
To:     Mark Rutland <mark.rutland@....com>
Cc:     linux-kernel@...r.kernel.org, aleksandar.qemu.devel@...il.com,
        alexandru.elisei@....com, anup.patel@....com,
        aou@...s.berkeley.edu, atish.patra@....com,
        benh@...nel.crashing.org, bp@...en8.de, catalin.marinas@....com,
        chenhuacai@...nel.org, dave.hansen@...ux.intel.com,
        david@...hat.com, frankja@...ux.ibm.com, frederic@...nel.org,
        gor@...ux.ibm.com, hca@...ux.ibm.com, imbrenda@...ux.ibm.com,
        james.morse@....com, jmattson@...gle.com, joro@...tes.org,
        kvm@...r.kernel.org, maz@...nel.org, mingo@...hat.com,
        mpe@...erman.id.au, nsaenzju@...hat.com, palmer@...belt.com,
        paulmck@...nel.org, paulus@...ba.org, paul.walmsley@...ive.com,
        pbonzini@...hat.com, seanjc@...gle.com, suzuki.poulose@....com,
        tglx@...utronix.de, tsbogend@...ha.franken.de, vkuznets@...hat.com,
        wanpengli@...cent.com, will@...nel.org
Subject: Re: [PATCH 0/5] kvm: fix latent guest entry/exit bugs



Am 14.01.22 um 13:19 schrieb Mark Rutland:
> On Thu, Jan 13, 2022 at 04:20:07PM +0100, Christian Borntraeger wrote:
>>
>>
>> Am 11.01.22 um 16:35 schrieb Mark Rutland:
>>> Several architectures have latent bugs around guest entry/exit, most
>>> notably:
>>>
>>> 1) Several architectures enable interrupts between guest_enter() and
>>>      guest_exit(). As this period is an RCU extended quiescent state (EQS) this
>>>      is unsound unless the irq entry code explicitly wakes RCU, which most
>>>      architectures only do for entry from usersapce or idle.
>>>
>>>      I believe this affects: arm64, riscv, s390
>>>
>>>      I am not sure about powerpc.
>>>
>>> 2) Several architectures permit instrumentation of code between
>>>      guest_enter() and guest_exit(), e.g. KASAN, KCOV, KCSAN, etc. As
>>>      instrumentation may directly o indirectly use RCU, this has the same
>>>      problems as with interrupts.
>>>
>>>      I believe this affects: arm64, mips, powerpc, riscv, s390
>>>
>>> 3) Several architectures do not inform lockdep and tracing that
>>>      interrupts are enabled during the execution of the guest, or do so in
>>>      an incorrect order. Generally
>>>      this means that logs will report IRQs being masked for much longer
>>>      than is actually the case, which is not ideal for debugging. I don't
>>>      know whether this affects the correctness of lockdep.
>>>
>>>      I believe this affects: arm64, mips, powerpc, riscv, s390
>>>
>>> This was previously fixed for x86 specifically in a series of commits:
>>>
>>>     87fa7f3e98a1310e ("x86/kvm: Move context tracking where it belongs")
>>>     0642391e2139a2c1 ("x86/kvm/vmx: Add hardirq tracing to guest enter/exit")
>>>     9fc975e9efd03e57 ("x86/kvm/svm: Add hardirq tracing on guest enter/exit")
>>>     3ebccdf373c21d86 ("x86/kvm/vmx: Move guest enter/exit into .noinstr.text")
>>>     135961e0a7d555fc ("x86/kvm/svm: Move guest enter/exit into .noinstr.text")
>>>     160457140187c5fb ("KVM: x86: Defer vtime accounting 'til after IRQ handling")
>>>     bc908e091b326467 ("KVM: x86: Consolidate guest enter/exit logic to common helpers")
>>>
>>> But other architectures were left broken, and the infrastructure for
>>> handling this correctly is x86-specific.
>>>
>>> This series introduces generic helper functions which can be used to
>>> handle the problems above, and migrates architectures over to these,
>>> fixing the latent issues.
>>>
>>> I wasn't able to figure my way around powerpc and s390, so I have not
>>
>> I think 2 later patches have moved the guest_enter/exit a bit out.
>> Does this make the s390 code clearer?
> 
> Yes; that's much simpler to follow!
> 
> One major thing I wasn't sure about for s390 is the sequence:
> 
> 	guest_enter_irqoff();	// Enters an RCU EQS
> 	...
> 	local_irq_enable();
> 	...
> 	sie64a(...);
> 	...
> 	local_irq_disable();
> 	...
> 	guest_exit_irqoff();	// Exits an RCU EQS
> 
> ... since if an IRQ is taken between local_irq_{enable,disable}(), RCU won't be
> watching, and I couldn't spot whether your regular IRQ entry logic would wake
> RCU in this case, or whether there was something else I'm missing that saves
> you here.
> 
> For other architectures, including x86 and arm64, we enter the guest with IRQs
> masked and return from the guest with IRQs masked, and don't actually take IRQs
> until we unmask them in the host, after the guest_exit_*() logic has woken RCU
> and so on.
> 
> I wasn't able to find documentation on the semantics of SIE, so I couldn't spot
> whether the local_irq_{enable,disable}() calls were necessary, or could be
> removed.

We run the SIE instruction with interrupts enabled. SIE is interruptible.
The disable/enable pairs are just because  guest_enter/exit_irqoff() require them.
One thing to be aware of: in our entry.S - after an interrupt - we leave SIE by
setting the return address of the interrupt after the sie instruction so that we
get back into this __vcpu_run loop to check for signals and so.


> 
> Thanks,
> Mark.
> 
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 577f1ead6a51..5859207c2cc0 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -4145,10 +4145,6 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>>                   * As PF_VCPU will be used in fault handler, between
>>                   * guest_enter and guest_exit should be no uaccess.
>>                   */
>> -               local_irq_disable();
>> -               guest_enter_irqoff();
>> -               __disable_cpu_timer_accounting(vcpu);
>> -               local_irq_enable();
>>                  if (kvm_s390_pv_cpu_is_protected(vcpu)) {
>>                          memcpy(sie_page->pv_grregs,
>>                                 vcpu->run->s.regs.gprs,
>> @@ -4156,8 +4152,16 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>>                  }
>>                  if (test_cpu_flag(CIF_FPU))
>>                          load_fpu_regs();
>> +               local_irq_disable();
>> +               __disable_cpu_timer_accounting(vcpu);
>> +               guest_enter_irqoff();
>> +               local_irq_enable();
>>                  exit_reason = sie64a(vcpu->arch.sie_block,
>>                                       vcpu->run->s.regs.gprs);
>> +               local_irq_disable();
>> +               guest_exit_irqoff();
>> +               __enable_cpu_timer_accounting(vcpu);
>> +               local_irq_enable();
>>                  if (kvm_s390_pv_cpu_is_protected(vcpu)) {
>>                          memcpy(vcpu->run->s.regs.gprs,
>>                                 sie_page->pv_grregs,
>> @@ -4173,10 +4177,6 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>>                                  vcpu->arch.sie_block->gpsw.mask &= ~PSW_INT_MASK;
>>                          }
>>                  }
>> -               local_irq_disable();
>> -               __enable_cpu_timer_accounting(vcpu);
>> -               guest_exit_irqoff();
>> -               local_irq_enable();
>>                  vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
>>                  rc = vcpu_post_run(vcpu, exit_reason);

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ