[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4EF1146D.208@siemens.com>
Date: Wed, 21 Dec 2011 00:04:13 +0100
From: Jan Kiszka <jan.kiszka@...mens.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: Alexey Zaytsev <alexey.zaytsev@...il.com>,
"Liu, Jinsong" <jinsong.liu@...el.com>,
Kernel development list <linux-kernel@...r.kernel.org>,
Marcelo Tosatti <mtosatti@...hat.com>,
Avi Kivity <avi@...hat.com>,
"Garrett D'Amore" <garrett@...enta.com>
Subject: Re: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057:
KVM: emulate lapic tsc deadline timer for guest"
On 2011-12-20 19:58, Linus Torvalds wrote:
> On Tue, Dec 20, 2011 at 1:51 AM, Alexey Zaytsev
> <alexey.zaytsev@...il.com> wrote:
>>
>> Let me clarify the situation.
>>
>> Before this commit, the tsc was advertised in cpuid, and it was
>> handled, if I understand things correctly, by qemu.
>> After this commit, the tsc is advertised in cpuid, and is handled in
>> the kernel, but only after qemu issues KVM_CREATE_IRQCHIP. If it does
>> not issue the ioctl, the kernel just discards any wrmsrs done to the
>> tsc. This does not look like an Illumos problem to me. Linux guests
>> kind of work here, because they are prepared to work on utterly
>> broken hardware. Good for you, but please don't break less-prepared
>> guests.
>
> Yes. This is a regression, and needs to be fixed.
>
> Liu, if you don't have time to debug it, we'll just revert the commit.
> It's that easy. Regressions are not allowed. There are no excuses.
>
> In particular, saying "just wait for qemu-kvm" is not an acceptable
> answer, because the point is that things *used* to work, and they
> broke. No "change it to work with the new kernel" allowed, except for
> some *very* rare critical circumstances (usually "major security-bug
> that we had to fix, and people who relied on it are thus out of
> luck").
>
> Commit a3e06bbe8445 still seems to revert cleanly, so that is the easy option.
>
> That said, it sounds like maybe another solution is to start with the
> TSC_DEADLINE timer bit in cpuid cleared, and only setting it after the
> KVM_CREATE_IRQCHIP ioctl.
KVM cpuid discovery can (and is) unfortunately be done before user space
decides about creating an in-kernel irqchip or not. But the patch in
question suggests to user space that it's optimal and perfectly fine to
expose TSC_DEADLINE_TIMER to the guest - obviously a wrong suggestion in
case user space does not use the in-kernel irqchip and does not
implement the deadline timer in its own APIC model.
So the only proper solution I see is to drop that feature flag exposure
from kvm_update_cpuid. Instead, KVM should indicate to *well informed*
(i.e. updated) user space differently that there is now deadline timer
support in the in-kernel APIC model. That different channel should be a
KVM_CAP flag that user space can easily check and then merge the
necessary flag on its own.
>
> In fact, the patch is clearly buggy, in that it apparently doesn't
> emulate TSC_DEADLINE correctly and natively on its own.
>
> Jan, Marcelo, Avi - is there a quick fix, or should I just revert?
Maybe fixing can be done quickly. In any case, the current ABI should
not be exposed to user space, even at the price of postponing it's
introduction. Just my 2 cents, though.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists