Message-ID: <20170313193910.GB4547@potion>
Date: Mon, 13 Mar 2017 20:39:11 +0100
From: Radim Krčmář <rkrcmar@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: linux-kernel@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>,
Jonathan Corbet <corbet@....net>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
kvm@...r.kernel.org, linux-doc@...r.kernel.org
Subject: Re: [PATCH] kvm: better MWAIT emulation for guests

2017-03-13 18:08+0200, Michael S. Tsirkin:
> On Mon, Mar 13, 2017 at 04:46:20PM +0100, Radim Krčmář wrote:
>> 2017-03-10 00:29+0200, Michael S. Tsirkin:
>> > Some guests call mwait without checking the cpu flags. We currently
>> > emulate that as a NOP but on VMX we can do better: let guest stop the
>> > CPU until timer or IPI. CPU will be busy but that isn't any worse than
>> > a NOP emulation.
>> >
>> > Note that mwait within guests is not the same as on real hardware
>> > because you must halt if you want to go deep into sleep.
>>
>> SDM (25.3 CHANGES TO INSTRUCTION BEHAVIOR IN VMX NON-ROOT OPERATION)
>> says that "MWAIT operates normally". What is the reason why MWAIT
>> inside VMX cannot reach the same states as MWAIT outside VMX?
>
> If you are going into a deep sleep state with huge latency you are
> better off exiting and paying an extra microsecond latency
> since a chance some other task will want to schedule seems higher.
Oh, so MWAIT behavior is the same and can reach deep sleep states; the
use-cases just differ ... If the guest VCPU is running on an isolated
CPU, then you might want to reach a deep state to save power when there
is no better use for it.
>> > Thus it isn't
>> > a good idea to use the regular MWAIT flag in CPUID for that. Add a flag
>> > in the hypervisor leaf instead.
>> >
>> > Signed-off-by: Michael S. Tsirkin <mst@...hat.com>
>> > ---
>> [...]
>> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> > @@ -594,6 +594,9 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>> > + if (this_cpu_has(X86_FEATURE_MWAIT))
>> > + entry->eax = (1 << KVM_FEATURE_MWAIT);
>>
>> I'd rather not add it as a paravirt feature:
>>
>> - MWAIT requires the software to provide a target state, but we're not
>> doing anything to expose those states.
>
> Current linux guests just discover these states based on
> CPU model, so we do expose enough info.
Linux still filters the hardcoded hints through CPUID[5].edx, which is 0
in our case.
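
(Simplified illustration, not the actual intel_idle code: each 4-bit
field of CPUID[5].edx is the number of MWAIT sub-states for one C-state,
so with edx == 0 every hardcoded hint gets dropped.)

#include <cpuid.h>
#include <stdio.h>

/* Each 4-bit field of CPUID leaf 5 EDX gives the number of MWAIT
 * sub-states for one C-state (valid when ECX[0] is set).  With EDX == 0
 * nothing survives the filtering. */
static int mwait_substates(unsigned int cstate)
{
	unsigned int eax, ebx, ecx, edx;

	__cpuid(5, eax, ebx, ecx, edx);
	return (edx >> (cstate * 4)) & 0xf;
}

int main(void)
{
	for (unsigned int cstate = 0; cstate < 8; cstate++)
		printf("C%u: %d sub-states\n", cstate, mwait_substates(cstate));
	return 0;
}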
>> The feature would need very constrained setup, which is hard to
>> support
>
> Why would it? It works without any tweaking on several boxes
> I own.
MWAIT hints do not always mean the same thing, so they could lead to
different power/performance tradeoffs than the application expects. We
should at least specify that the paravirt feature allows only hint 0.
You probably don't run weird combinations of host/guest CPUs.
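
(For reference, "hint 0" is just what ends up in EAX for the instruction
itself; a minimal illustration, kernel-context only since MONITOR/MWAIT
fault at CPL > 0:)

/* Illustration of "hint 0": EAX[7:4] selects the target C-state
 * (0 means C1) and EAX[3:0] the sub-state; ECX holds the extensions.
 * MONITOR/MWAIT #GP at CPL > 0, so this only makes sense in kernel
 * context. */
static inline void mwait_hint0(const void *addr)
{
	asm volatile("monitor" :: "a"(addr), "c"(0UL), "d"(0UL));
	asm volatile("mwait" :: "a"(0UL), "c"(0UL));
}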
>> - we've had requests to support MWAIT emulation for Linux and fully
>> emulating MWAIT would be best.
>> MWAIT is not going to be enabled by default, of course; it would be
>> targeted at LPAR-like uses of KVM.
>
> Yes I think this limited emulation is safe to enable by default.
> Pretending mwait is equivalent to halt maybe isn't.
Right, we must keep the VCPU thread running when emulating mwait, as it
is different from hlt.
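
(To be clear on the mechanism: on VMX this amounts to clearing the
MONITOR/MWAIT exiting bits in the primary exec controls, roughly like
the sketch below.  The control names are from arch/x86/include/asm/vmx.h;
kvm_mwait_in_guest() is a made-up helper for the sketch, not the exact
hunk.)

/* Sketch only: let the guest execute MWAIT natively by not exiting on
 * it.  Unlike hlt emulation, the vCPU thread then keeps its host CPU
 * busy while the guest sits in mwait. */
static u32 adjust_exec_control(u32 exec_control)
{
	if (kvm_mwait_in_guest())	/* made-up helper for this sketch */
		exec_control &= ~(CPU_BASED_MWAIT_EXITING |
				  CPU_BASED_MONITOR_EXITING);
	return exec_control;
}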
>> What about keeping just the last hunk to improve OS X, for now?
>>
>> Thanks.
>
> IMHO if we have new functionality we are better off creating
> some way for guests to discover it is there.
>
> Do we really have to argue about a single bit in HV leaf?
> What harm does it do?
It adds code to both guests and hosts and needs documentation ...
The bit is acceptable. I just see no point in having it when there
already is a detection mechanism for mwait.
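
(That mechanism being CPUID.01H:ECX[3] plus leaf 5 for the states, e.g.
a guest can already do something like:)

#include <cpuid.h>
#include <stdio.h>

/* The existing detection path: CPUID.01H:ECX bit 3 is the MONITOR/MWAIT
 * feature flag; leaf 5 then enumerates line sizes and sub-states. */
int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	__cpuid(1, eax, ebx, ecx, edx);
	printf("MONITOR/MWAIT: %s\n", (ecx & (1u << 3)) ? "yes" : "no");
	return 0;
}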
In any case, this patch should also remove VM exits under SVM and add
KVM_CAP_MWAIT for userspace. Userspace can then set the MWAIT feature
if it wishes the guest to use it in a more standard way.
I can do a cleanup of the then-unused VM exit handling on top of it.
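
From the userspace side that could look roughly like the following
(KVM_CAP_MWAIT is only the number proposed here, hence the #ifdef;
advertising MWAIT to the guest would then go through the usual
KVM_SET_CPUID2 tables):

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Sketch: query the proposed capability so the VMM can decide whether
 * to advertise MWAIT to the guest via its CPUID tables.  KVM_CAP_MWAIT
 * does not exist yet; it is what this patch would add. */
int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	if (kvm < 0) {
		perror("/dev/kvm");
		return 1;
	}
#ifdef KVM_CAP_MWAIT
	int has_mwait = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_MWAIT);
	printf("KVM_CAP_MWAIT: %d\n", has_mwait);
#else
	puts("KVM_CAP_MWAIT not defined in this kernel's headers");
#endif
	return 0;
}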
Thanks.