Message-ID: <a6fd8266-54a8-bb4b-f3d2-643a94e27e9e@intel.com>
Date: Thu, 13 Jul 2023 21:32:11 +0800
From: Xiaoyao Li <xiaoyao.li@...el.com>
To: Wang Jianchao <jianchwa@...look.com>,
Zhi Wang <zhi.wang.linux@...il.com>
Cc: seanjc@...gle.com, tglx@...utronix.de, mingo@...hat.com,
bp@...en8.de, dave.hansen@...ux.intel.com, x86@...nel.org,
hpa@...or.com, kvm@...r.kernel.org, arkinjob@...look.com,
linux-kernel@...r.kernel.org
Subject: Re: [RFC 0/3] KVM: x86: introduce pv feature lazy tscdeadline
On 7/13/2023 10:50 AM, Wang Jianchao wrote:
>
>
> On 2023.07.13 02:14, Zhi Wang wrote:
>> On Fri, 7 Jul 2023 14:17:58 +0800
>> Wang Jianchao <jianchwa@...look.com> wrote:
>>
>>> Hi
>>>
>>> This patchset attempts to introduce a new pv feature, lazy tscdeadline.
>>> Every time the guest writes MSR_IA32_TSC_DEADLINE, a vm-exit occurs
>>> and the host side handles it. However, many of these vm-exits are
>>> unnecessary because the timer is often overwritten before it expires.
>>>
>>> v : write to msr of tsc deadline
>>> | : timer armed by tsc deadline
>>>
>>> v v v v v | | | | |
>>> ---------------------------------------> Time
>>>
>>> The timer armed by an msr write is overwritten before it expires, so the
>>> vm-exit it caused is wasted. Lazy tscdeadline works as follows:
>>>
>>> v v v v v | |
>>> ---------------------------------------> Time
>>> '- arm -'
>>>
>>
>> Interesting patch.
>>
>> I am a little bit confused by the chart above. It seems the number of MSR
>> writes, which are said to cause VM exits, is not reduced in the lazy
>> tscdeadline chart; only the number of arms goes down. Yet the benefit of
>> lazy tscdeadline is said to come from "fewer vm exits". Maybe it would be
>> better to improve the chart a little to help people grasp the idea
>> easily?
>
> Thanks so much for your comment and sorry for my poor chart.
>
> Let me try to rework the chart.
>
> Before this patch, every time the guest starts or modifies an hrtimer, it needs to write the tsc deadline msr;
> a vm-exit occurs and the host arms a hv or sw timer for it.
>
>
> w: write msr
> x: vm-exit
> t: hv or sw timer
>
>
> Guest
> w
> ---------------------------------------> Time
> Host x t
>
>
> However, in workloads that set up timers frequently, the tscdeadline msr is usually overwritten
> many times before the timer expires, and every time we modify the tscdeadline, a vm-exit occurs.
>
>
> 1. write to msr with t0
>
> Guest
> w0
> ----------------------------------------> Time
> Host x0 t0
>
>
> 2. write to msr with t1
> Guest
> w1
> ------------------------------------------> Time
> Host x1 t0->t1
>
>
> 3. write to msr with t2
> Guest
> w2
> ------------------------------------------> Time
> Host x2 t1->t2
>
>
> 4. write to msr with t3
> Guest
> w3
> ------------------------------------------> Time
> Host x3 t2->t3
>
>
>
> What this patch wants to do is eliminate the vm-exits x1, x2 and x3, as follows.
>
>
> First, we have two fields shared between guest and host, as in other pv features:
> - armed, the value of tscdeadline that has a timer on the host side, only updated by the __host__ side
> - pending, the next value of tscdeadline, only updated by the __guest__ side
>
>
> 1. write to msr with t0
>
> armed : t0
> pending : t0
> Guest
> w0
> ----------------------------------------> Time
> Host x0 t0
>
> a vm-exit occurs and the host arms a timer for t0
What's the initial value of @armed and @pending?
>
> 2. write to msr with t1
>
> armed : t0
> pending : t1
>
> Guest
> w1
> ------------------------------------------> Time
> Host t0
>
> the tsc deadline value that has been armed, namely t0, is smaller than t1, so there is no need to write
> to the msr; just update pending
If t1 < t0, then it triggers a vm exit, right?
And in this case, I think @armed will be updated to t1. What about
@pending? Will it be updated to t1 or not?
>
> 3. write to msr with t2
>
> armed : t0
> pending : t2
>
> Guest
> w2
> ------------------------------------------> Time
> Host t0
>
> Similar to step 2: just update the pending field with t2, no vm-exit
>
>
> 4. write to msr with t3
>
> armed : t0
> pending : t3
>
> Guest
> w3
> ------------------------------------------> Time
> Host t0
> Similar to step 2: just update the pending field with t3, no vm-exit
>
>
> 5. t0 expires, arm t3
>
> armed : t3
> pending : t3
>
>
> Guest
>
> ------------------------------------------> Time
> Host t0 ------> t3
>
> When t0 fires, the host checks the pending field and re-arms a timer based on it.
>
>
> This is the core idea of this patch ;)
>
>
> Thanks
> Jianchao
>
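For reference, the steps quoted above can be modelled in a few lines of
user-space C. This is only a sketch, not code from the patchset: struct
lazy_tscdeadline, struct sim and the three helper functions are hypothetical
names. It also bakes in two assumptions that the description leaves open:
that 0 in @armed means no host timer is armed (the initial state), and that a
write with a deadline earlier than @armed takes the vm-exit and sets both
@armed and @pending to the new value. Races between guest and host updates to
the shared fields are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout of the data shared between guest and host.
 * Field names follow the description; the real patchset may differ. */
struct lazy_tscdeadline {
    uint64_t armed;   /* deadline the host has a timer for; host-written  */
    uint64_t pending; /* latest deadline the guest wants; guest-written   */
};

/* Toy model: the shared fields plus a counter of msr-write vm-exits. */
struct sim {
    struct lazy_tscdeadline shared;
    unsigned long vm_exits;
};

/* Host side: models the vm-exit taken on a write to the deadline msr. */
static void host_handle_msr_write(struct sim *s, uint64_t deadline)
{
    s->vm_exits++;
    s->shared.armed = deadline; /* host arms a hv or sw timer for it */
}

/* Guest side: returns true if the write trapped to the host. */
static bool guest_set_deadline(struct sim *s, uint64_t deadline)
{
    s->shared.pending = deadline;

    /* Assumption: armed == 0 means "no host timer armed". If a timer is
     * armed and fires no later than the new deadline, skip the msr write;
     * the host will pick up pending when that timer expires. */
    if (s->shared.armed != 0 && s->shared.armed <= deadline)
        return false;

    /* Nothing armed, or the new deadline is earlier: take the vm-exit. */
    host_handle_msr_write(s, deadline);
    return true;
}

/* Host side: the armed timer fired. Returns true if the guest's real
 * deadline was reached (a timer interrupt would be injected), false if
 * the host only re-armed a later timer from pending. */
static bool host_timer_expired(struct sim *s)
{
    uint64_t fired = s->shared.armed;

    assert(fired != 0); /* only called while a timer is armed */

    if (s->shared.pending > fired) {
        s->shared.armed = s->shared.pending; /* re-arm, no guest exit */
        return false;
    }
    s->shared.armed = 0; /* back to the assumed "nothing armed" state */
    return true;
}
```

In this model, replaying the example with t0 < t1 < t2 < t3 through
guest_set_deadline() costs one vm-exit instead of four, and the first call to
host_timer_expired() moves @armed from t0 to t3 without involving the guest.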
>>
>>> The 1st timer is responsible for arming the next timer. When the armed
>>> timer expires, it checks pending and arms a new timer.
>>>
>>> In the netperf test with TCP_RR on loopback, lazy_tscdeadline cuts the
>>> total number of vm-exits by more than half.
>>>
>>>                              Close                Open
>>> --------------------------------------------------------
>>> VM-Exit
>>> sum                       12617503             5815737
>>> intr               0%        37023     0%        33002
>>> cpuid              0%            1     0%            0
>>> halt              19%      2503932    47%      2780683
>>> msr-write         79%     10046340    51%      2966824
>>> pause              0%           90     0%           84
>>> ept-violation      0%          584     0%          336
>>> ept-misconfig      0%            0     0%            2
>>> preemption-timer   0%        29518     0%        34800
>>> -------------------------------------------------------
>>> MSR-Write
>>> sum                       10046455             2966864
>>> apic-icr          25%      2533498    93%      2781235
>>> tsc-deadline      74%      7512945     6%       185629
>>>
>>> This patchset is made and tested on 6.4.0 and includes 3 patches:
>>>
>>> The 1st one adds the necessary data structures for this feature
>>> The 2nd one adds the specific msr operations between guest and host
>>> The 3rd one is the one that makes this feature work.
>>>
>>> Any comment is welcome.
>>>
>>> Thanks
>>> Jianchao
>>>
>>> Wang Jianchao (3)
>>> KVM: x86: add msr register and data structure for lazy tscdeadline
>>> KVM: x86: exchange info about lazy_tscdeadline with msr
>>> KVM: X86: add lazy tscdeadline support to reduce vm-exit of msr-write
>>>
>>>
>>> arch/x86/include/asm/kvm_host.h | 10 ++++++++
>>> arch/x86/include/uapi/asm/kvm_para.h | 9 +++++++
>>> arch/x86/kernel/apic/apic.c | 47 ++++++++++++++++++++++++++++++++++-
>>> arch/x86/kernel/kvm.c | 13 ++++++++++
>>> arch/x86/kvm/cpuid.c | 1 +
>>> arch/x86/kvm/lapic.c | 128 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------
>>> arch/x86/kvm/lapic.h | 4 +++
>>> arch/x86/kvm/x86.c | 26 ++++++++++++++++++++
>>> 8 files changed, 229 insertions(+), 9 deletions(-)
>>