Message-ID: <4b5e2981-17a3-788b-0dac-c2f125765de3@gmail.com>
Date: Mon, 23 May 2016 09:26:34 +0800
From: Yang Zhang <yang.zhang.wz@...il.com>
To: David Matlack <dmatlack@...gle.com>
Cc: Wanpeng Li <kernellwp@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
kvm list <kvm@...r.kernel.org>,
Wanpeng Li <wanpeng.li@...mail.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Radim Krčmář <rkrcmar@...hat.com>,
Christian Borntraeger <borntraeger@...ibm.com>
Subject: Re: [PATCH v2] KVM: halt-polling: poll if emulated lapic timer will
fire soon
On 2016/5/21 2:37, David Matlack wrote:
> On Thu, May 19, 2016 at 7:04 PM, Yang Zhang <yang.zhang.wz@...il.com> wrote:
>> On 2016/5/20 2:36, David Matlack wrote:
>>>
>>> On Thu, May 19, 2016 at 11:01 AM, David Matlack <dmatlack@...gle.com>
>>> wrote:
>>>>
>>>> On Thu, May 19, 2016 at 6:27 AM, Wanpeng Li <kernellwp@...il.com> wrote:
>>>>>
>>>>> From: Wanpeng Li <wanpeng.li@...mail.com>
>>>>>
>>>>> If the emulated lapic timer will fire soon (within 10us, the base
>>>>> of dynamic halt-polling; the lower end of message-passing workload
>>>>> latency, e.g. TCP_RR poll time, is < 10us), we can treat the halt
>>>>> as a short one and poll until the timer fires. The expiry callback
>>>>> apic_timer_fn() sets KVM_REQ_PENDING_TIMER, and this flag is
>>>>> checked during the busy poll. This avoids the context-switch
>>>>> overhead and the latency of waking up the vCPU.
>>>>
>>>>
>>>> If I understand correctly, your patch aims to reduce the latency of
>>>> (APIC Timer expires) -> (Guest resumes execution) using halt-polling.
>>>> Let me know if I'm misunderstanding.
>>>>
>>>> In general, I don't think it makes sense to poll for timer interrupts.
>>>> We know when the timer interrupt is going to arrive. If we care about
>>>> the latency of delivering that interrupt to the guest, we should
>>>> program the hrtimer to wake us up slightly early, and then deliver the
>>>> virtual timer interrupt right on time (I think KVM's TSC Deadline
>>>> Timer emulation already does this).
>>>
>>>
>>> (It looks like the way to enable this feature is to set the module
>>> parameter lapic_timer_advance_ns and make sure your guest is using the
>>> TSC Deadline timer instead of the APIC Timer.)
>>
>>
>> This feature is slightly different from the current advance-expiration
>> mechanism. Advance expiration relies on the VCPU already running (it
>> polls before vmentry). But in some cases the timer interrupt may be
>> blocked by another thread (i.e., the IF bit is clear) and the VCPU
>> cannot be scheduled to run immediately. So even if the timer is
>> advanced, the VCPU may still see the latency. Polling is different: it
>> ensures the VCPU observes the timer expiration before being scheduled
>> out.
>>
>>>
>>>> I'm curious to know if this scheme
>>>> would give the same performance improvement to iperf as your patch.
>>>>
>>>> We discussed this a bit on the mailing list before
>>>> (https://lkml.org/lkml/2016/3/29/680). I'd like to see halt-polling
>>>> and timer interrupts go in the opposite direction: if the next timer
>>>> event (from any timer) is less than vcpu->halt_poll_ns, don't poll at
>>>> all.
>>>>
>>>>>
>>>>> iperf TCP get ~6% bandwidth improvement.
>>>>
>>>>
>>>> Can you explain why your patch results in this bandwidth improvement?
>>
>>
>> It should be reasonable. I have seen the same improvement with a
>> context-switch benchmark: the latency is reduced from ~2600ns to
>> ~2300ns with a similar mechanism (the same idea, but a different
>> implementation).
>
> It's not obvious to me why polling for a timer interrupt would improve
> context switch latency. Can you explain a bit more?
We have a workload that uses a high resolution timer (less than 1ms)
inside the guest. It relies on the timer to wake itself up. Sometimes
the timer is expected to fire just after the VCPU blocks on a halt
instruction. But the thread running on the physical CPU may turn off
hardware interrupts for a long time due to disk access, so the timer
interrupt is blocked until interrupts are re-enabled.
As an optimization, we let the VCPU poll for a while before scheduling
out if the next timer will arrive soon. The results look good when
running several workloads inside the guest.
--
best regards
yang