Message-ID: <20150814083857.GA2492@potion.redhat.com>
Date:	Fri, 14 Aug 2015 10:38:58 +0200
From:	Radim Krčmář <rkrcmar@...hat.com>
To:	Paolo Bonzini <pbonzini@...hat.com>
Cc:	linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
	Steve Rutherford <srutherford@...gle.com>,
	stable@...r.kernel.org
Subject: Re: [PATCH 2/2] KVM: x86: fix edge EOI and IOAPIC reconfig race

2015-08-13 16:53+0200, Paolo Bonzini:
> On 13/08/2015 15:46, Radim Krčmář wrote:
>>  1) IOAPIC injects a vector from i8254
>>  2) guest reconfigures that vector's VCPU and therefore eoi_exit_bitmap
>>     on original VCPU gets cleared
>>  3) guest's handler for the vector does EOI
>>  4) KVM's EOI handler doesn't pass that vector to IOAPIC because it is
>>     not in that VCPU's eoi_exit_bitmap
>>  5) i8254 stops working
>> 
>> This creates an unwanted situation if the vector is reused by a
>> non-IOAPIC source, but I think it is so rare that we don't want to make
>> the solution more sophisticated. 
> 
> What happens if the vector is changed in step 2?
> __kvm_ioapic_update_eoi won't match the redirection table entry.

Yes, the EOI is going to be ignored.  (With APICv, VMX won't even exit.)
In the patch, I dismissed it as "shouldn't happen in the wild" because
we've always had the vector-change bug :) (Unlike the destination-change
one, which was APICv-only before recent changes.)
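
To illustrate how the EOI gets lost, here is a toy model of the sequence.
It is only a sketch and not KVM code: every type and function name in it
(toy_vcpu, toy_ioapic, toy_scan, toy_inject, toy_eoi) is made up, and it
assumes a single redirection entry and a scan that rebuilds the bitmaps
from the current entry only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_NR_VECTORS 256

/* Hypothetical, simplified stand-ins; these are not KVM's structures. */
struct toy_vcpu {
	uint8_t eoi_exit_bitmap[TOY_NR_VECTORS / 8];
};

struct toy_ioapic {
	uint8_t vector;   /* vector in the (single) redirection entry */
	int dest_vcpu;    /* index of the destination VCPU */
	bool pending_eoi; /* an injected interrupt still waits for its EOI */
};

/* Rebuild all bitmaps from the current redirection entry only. */
static void toy_scan(const struct toy_ioapic *io, struct toy_vcpu *vcpus,
		     int nr_vcpus)
{
	assert(io->dest_vcpu >= 0 && io->dest_vcpu < nr_vcpus);
	for (int i = 0; i < nr_vcpus; i++)
		memset(vcpus[i].eoi_exit_bitmap, 0,
		       sizeof(vcpus[i].eoi_exit_bitmap));
	vcpus[io->dest_vcpu].eoi_exit_bitmap[io->vector / 8] |=
		(uint8_t)(1u << (io->vector % 8));
}

static void toy_inject(struct toy_ioapic *io)
{
	io->pending_eoi = true;
}

/* Returns true if the EOI done on this VCPU reached the ioapic. */
static bool toy_eoi(struct toy_ioapic *io, const struct toy_vcpu *vcpu,
		    uint8_t vector)
{
	bool exits = vcpu->eoi_exit_bitmap[vector / 8] & (1u << (vector % 8));

	if (!exits)
		return false; /* destination changed: EOI never forwarded */
	if (io->vector != vector)
		return false; /* vector changed: no matching entry */
	io->pending_eoi = false;
	return true;
}
```

In this model, changing dest_vcpu (or vector) and rescanning between
toy_inject() and toy_eoi() leaves pending_eoi set, which corresponds to
the "i8254 stops working" step.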

A simple solution to the vector-change bug would keep a list of one-time
fixups (vector, *ioapic), with hooks in ioapic reconfig, scan and EOI.
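
A rough sketch of what such a fixup list could look like, with one helper
per hook.  None of this exists in KVM: the names (eoi_fixup, fixup_add,
fixup_wants_exit, fixup_eoi), the fixed-size table and the "acked"
counter standing in for the real ack path are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FIXUPS 8 /* arbitrary bound, chosen for the sketch */

/* Hypothetical stand-in for an ioapic; "acked" counts delivered EOIs. */
struct sketch_ioapic {
	int acked;
};

struct eoi_fixup {
	uint8_t vector;               /* old vector that still expects an EOI */
	struct sketch_ioapic *ioapic; /* who to notify when it arrives */
	bool used;
};

struct fixup_list {
	struct eoi_fixup slot[MAX_FIXUPS];
};

/* Reconfig hook: remember the old vector of an in-flight interrupt. */
static bool fixup_add(struct fixup_list *fl, uint8_t old_vector,
		      struct sketch_ioapic *io)
{
	assert(io != NULL);
	for (size_t i = 0; i < MAX_FIXUPS; i++) {
		if (!fl->slot[i].used) {
			fl->slot[i].vector = old_vector;
			fl->slot[i].ioapic = io;
			fl->slot[i].used = true;
			return true;
		}
	}
	return false; /* table full */
}

/* Scan hook: should this vector stay in the eoi_exit_bitmap? */
static bool fixup_wants_exit(const struct fixup_list *fl, uint8_t vector)
{
	for (size_t i = 0; i < MAX_FIXUPS; i++)
		if (fl->slot[i].used && fl->slot[i].vector == vector)
			return true;
	return false;
}

/* EOI hook: deliver the EOI and drop the fixup; returns fixups consumed. */
static int fixup_eoi(struct fixup_list *fl, uint8_t vector)
{
	int consumed = 0;

	for (size_t i = 0; i < MAX_FIXUPS; i++) {
		if (fl->slot[i].used && fl->slot[i].vector == vector) {
			fl->slot[i].ioapic->acked++;
			fl->slot[i].used = false;
			consumed++;
		}
	}
	return consumed;
}
```

Each fixup is consumed by the first matching EOI, so a later reuse of the
vector by some other source would no longer be forwarded.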

A complex solution would replace ioapic scanning with an array of lists
of ioapics, indexed by vector (each element needs to be a list or small
array because vectors can be shared).
An ioapic would be added to list[vector] on reconfig and removed from
its old list on the next reconfig, unless an edge fixup was needed, in
which case it would stay until the next EOI.  (I guess we won't need to
consider the vector in IRR and ISR.)
Callbacks would update the eoi_exit_bitmap on relevant changes.
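
The per-vector lists could be sketched roughly as follows.  This is an
illustration under assumptions, not a proposed patch: the names
(vec_table, vec_add, vec_reconfig, vec_needs_eoi_exit, vec_eoi), the
small fixed-size array per vector and the use of a plain ioapic id are
all invented here, and the callbacks that would update the
eoi_exit_bitmap are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VEC_NR_VECTORS 256
#define VEC_MAX_SHARERS 4 /* arbitrary: vectors can be shared */

struct vec_entry {
	int ioapic_id;
	bool in_use;
	bool stale; /* edge fixup: kept only until the next EOI */
};

struct vec_table {
	struct vec_entry list[VEC_NR_VECTORS][VEC_MAX_SHARERS];
};

static void vec_init(struct vec_table *t)
{
	memset(t, 0, sizeof(*t));
}

static bool vec_add(struct vec_table *t, uint8_t vector, int ioapic_id)
{
	assert(ioapic_id >= 0);
	for (int i = 0; i < VEC_MAX_SHARERS; i++) {
		struct vec_entry *e = &t->list[vector][i];

		if (!e->in_use) {
			e->ioapic_id = ioapic_id;
			e->in_use = true;
			e->stale = false;
			return true;
		}
	}
	return false; /* too many sharers for this sketch */
}

/* Move an ioapic to a new vector; keep the old entry if an edge
 * interrupt is still in flight on the old vector. */
static bool vec_reconfig(struct vec_table *t, uint8_t old_vec,
			 uint8_t new_vec, int ioapic_id, bool edge_in_flight)
{
	if (old_vec == new_vec)
		return true; /* the table only tracks vectors */
	for (int i = 0; i < VEC_MAX_SHARERS; i++) {
		struct vec_entry *e = &t->list[old_vec][i];

		if (e->in_use && e->ioapic_id == ioapic_id) {
			if (edge_in_flight)
				e->stale = true;
			else
				e->in_use = false;
		}
	}
	return vec_add(t, new_vec, ioapic_id);
}

/* Would this vector need a bit in the eoi_exit_bitmap? */
static bool vec_needs_eoi_exit(const struct vec_table *t, uint8_t vector)
{
	for (int i = 0; i < VEC_MAX_SHARERS; i++)
		if (t->list[vector][i].in_use)
			return true;
	return false;
}

/* EOI: count the ioapics to notify and drop stale entries. */
static int vec_eoi(struct vec_table *t, uint8_t vector)
{
	int notified = 0;

	for (int i = 0; i < VEC_MAX_SHARERS; i++) {
		struct vec_entry *e = &t->list[vector][i];

		if (!e->in_use)
			continue;
		notified++;
		if (e->stale) {
			e->in_use = false;
			e->stale = false;
		}
	}
	return notified;
}
```

With this layout an EOI only has to look at list[vector] instead of
scanning every redirection entry, and a stale entry disappears after the
one EOI it was kept for.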

I considered doing the complex one, but then it occurred to me that we
want the destination-change fixed in stable, as APICv machines are
starting to get used and people might migrate old guests to them.

> How do you reproduce the bug?

I run a RHEL4 (2.6.9) kernel on 2 VCPUs and frequently alternate the
smp_affinity of "timer".  The bug is hit within seconds.
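
For illustration, the alternation could be done from inside the guest
with something like the helper below.  This is a sketch, not the exact
reproducer: the function names are made up, and the path is passed in by
the caller because which IRQ number "timer" has (typically
/proc/irq/0/smp_affinity on x86) is an assumption to check in
/proc/interrupts.

```c
#include <assert.h>
#include <stdio.h>

/* Write a hexadecimal CPU mask to an smp_affinity-style file. */
static int write_affinity(const char *path, unsigned int mask)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%x\n", mask) < 0) {
		fclose(f);
		return -1;
	}
	/* procfs write errors may only show up at flush/close time */
	return fclose(f) == 0 ? 0 : -1;
}

/* Alternate between CPU 0 (mask 1) and CPU 1 (mask 2).
 * Returns the number of writes that succeeded. */
static int alternate_affinity(const char *path, int rounds)
{
	int ok = 0;

	assert(path != NULL && rounds >= 0);
	for (int i = 0; i < rounds; i++) {
		unsigned int mask = (i % 2 == 0) ? 1u : 2u;

		if (write_affinity(path, mask) == 0)
			ok++;
	}
	return ok;
}
```

Calling alternate_affinity() in a loop as root, with the timer IRQ's
smp_affinity path, keeps moving the interrupt between the two VCPUs.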
