lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5d545d25-cf15-a93b-f1a9-88e6775ded29@gmail.com>
Date:   Thu, 14 Dec 2017 22:26:48 +0800
From:   Quan Xu <quan.xu0@...il.com>
To:     "Gonglei (Arei)" <arei.gonglei@...wei.com>,
        "rkrcmar@...hat.com" <rkrcmar@...hat.com>
Cc:     Liran Alon <LIRAN.ALON@...CLE.COM>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Huangweidong (C)" <weidong.huang@...wei.com>
Subject: Re: [PATCH] KVM: x86: ioapic: Clear IRR for rtc bit when rtc EOI
 gotten



On 2017/12/14 21:15, Gonglei (Arei) wrote:
>> -----Original Message-----
>> From: Liran Alon [mailto:LIRAN.ALON@...CLE.COM]
>> Sent: Thursday, December 14, 2017 9:02 PM
>> To: Gonglei (Arei); pbonzini@...hat.com; rkrcmar@...hat.com
>> Cc: kvm@...r.kernel.org; linux-kernel@...r.kernel.org; Huangweidong (C)
>> Subject: Re: [PATCH] KVM: x86: ioapic: Clear IRR for rtc bit when rtc EOI gotten
>>
>>
>>
>> On 14/12/17 14:23, Gonglei wrote:
>>> We hit a bug in our test while run PCMark 10 in a windows 7 VM,
>>> The VM got stuck and the wallclock was hang after several minutes running
>>> PCMark 10 in it.
>>> It is quite easily to reproduce the bug with the upstream KVM and Qemu.
>>>
>>> We found that KVM can not inject any RTC irq to VM after it was hang, it fails
>> to
>>> Deliver irq in ioapic_set_irq() because RTC irq is still pending in ioapic->irr.
>>>
>>> static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
>>>                     int irq_level, bool line_status)
>>> {
>>> ...
>>>            if (!irq_level) {
>>>                     ioapic->irr &= ~mask;
>>>                     ret = 1;
>>>                     goto out;
>>>            }
>>> ...
>>>            if ((edge && old_irr == ioapic->irr) ||
>>>                (!edge && entry.fields.remote_irr)) {
>>>                     ret = 0;
>>>                     goto out;
>>>            }
>>>
>>> According to RTC spec, after RTC injects a High level irq, OS will read CMOS's
>>> register C to to clear the irq flag, and pull down the irq electric pin.
>>>
>>> For Qemu, we will emulate the reading operation in cmos_ioport_read(),
>>> but Guest OS will fire a write operation before to tell which register will be
>> read
>>> after this write, where we use s->cmos_index to record the following register
>> to read.
>>> But in our test, we found that there is a possible situation that Vcpu fails to
>> read
>>> RTC_REG_C to clear irq, This could happens while two VCpus are
>> writing/reading
>>> registers at the same time, for example, vcpu 0 is trying to read RTC_REG_C,
>>> so it write RTC_REG_C first, where the s->cmos_index will be RTC_REG_C,
>>> but before it tries to read register C, another vcpu1 is going to read
>> RTC_YEAR,
>>> it changes s->cmos_index to RTC_YEAR by a writing action.
>>> The next operation of vcpu0 will be lead to read RTC_YEAR, In this case, we
>> will miss
>>> calling qemu_irq_lower(s->irq) to clear the irq. After this, kvm will never inject
>> RTC irq,
>>> and Windows VM will hang.
>> If I understood correctly, this looks to me like a race-condition bug in
>> the Windows guest kernel. In real-hardware this race-condition will also
>> cause the RTC_YEAR to be read instead of RTC_REG_C.
>> Guest kernel should make sure that 2 CPUs does not attempt to read a
>> CMOS register in parallel as they can override each other's cmos_index.
>>
>> See for example how Linux kernel makes sure to avoid such kind of issues
>> in rtc_cmos_read() (arch/x86/kernel/rtc.c) by grabbing a cmos_lock.
>>
> Yes, I knew that.
>
>>> Let's clear IRR of rtc when corresponding EOI is gotten to avoid the issue.
>> Can you elaborate a bit more why it makes sense to put such workaround
>> in KVM code instead of declaring this as guest kernel bug?
>>
> I agree it's a Windows bug. The big problem is there is not problem on Xen platform.

  have you analyzed why it is not a problem on Xen?

Quan


> Thanks,
> -Gonglei
>
>> Regards,
>> -Liran
>>
>>> Suggested-by: Paolo Bonzini <pbonzini@...hat.com>
>>> Signed-off-by: Gonglei <arei.gonglei@...wei.com>
>>> ---
>>>     Thanks to Paolo provides a good solution. :)
>>>
>>>    arch/x86/kvm/ioapic.c | 1 +
>>>    1 file changed, 1 insertion(+)
>>>
>>> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
>>> index 4e822ad..5022d63 100644
>>> --- a/arch/x86/kvm/ioapic.c
>>> +++ b/arch/x86/kvm/ioapic.c
>>> @@ -160,6 +160,7 @@ static void rtc_irq_eoi(struct kvm_ioapic *ioapic,
>> struct kvm_vcpu *vcpu)
>>>    {
>>>    	if (test_and_clear_bit(vcpu->vcpu_id,
>>>    			       ioapic->rtc_status.dest_map.map)) {
>>> +		ioapic->irr &= ~(1 << RTC_GSI);
>>>    		--ioapic->rtc_status.pending_eoi;
>>>    		rtc_status_pending_eoi_check_valid(ioapic);
>>>    	}
>>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ