lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 29 Apr 2019 10:29:13 +0800
From:   "Tangnianyao (ICT)" <tangnianyao@...wei.com>
To:     Marc Zyngier <marc.zyngier@....com>
CC:     <kvmarm@...ts.cs.columbia.edu>, <linux-kernel@...r.kernel.org>,
        <linuxarm@...wei.com>, <guoheyi@...wei.com>, <kvm@...r.kernel.org>,
        <xuwei5@...wei.com>, <wanghaibin.wang@...wei.com>
Subject: Re: [RFC] Question about enable doorbell irq and halt_poll process



On 2019/4/23 18:00, Marc Zyngier wrote:

Hi, Marc

> On Tue, 23 Apr 2019 08:44:17 +0100,
> "Tangnianyao (ICT)" <tangnianyao@...wei.com> wrote:
>>
>> Hi, Marc
> 
> [...]
> 
>> I've learned that there's some implementation problem for the PCIe
>> controller of Hi1616, the processor of D05. The PCIe ACS was not
>> implemented properly and D05 doesn't support VM using pcie vf.
> 
> My D05 completely disagrees with you:
> 
> root@...ssigned-hostname:~# lspci -v
> 00:00.0 Host bridge: Red Hat, Inc. QEMU PCIe Host bridge
> 	Subsystem: Red Hat, Inc QEMU PCIe Host bridge
> 	Flags: fast devsel
> lspci: Unable to load libkmod resources: error -12
> 
> 00:01.0 Ethernet controller: Red Hat, Inc Virtio network device (rev 01)
> 	Subsystem: Red Hat, Inc Virtio network device
> 	Flags: bus master, fast devsel, latency 0, IRQ 40
> 	Memory at 10040000 (32-bit, non-prefetchable) [size=4K]
> 	Memory at 8000000000 (64-bit, prefetchable) [size=16K]
> 	Expansion ROM at 10000000 [disabled] [size=256K]
> 	Capabilities: [98] MSI-X: Enable+ Count=3 Masked-
> 	Capabilities: [84] Vendor Specific Information: VirtIO: <unknown>
> 	Capabilities: [70] Vendor Specific Information: VirtIO: Notify
> 	Capabilities: [60] Vendor Specific Information: VirtIO: DeviceCfg
> 	Capabilities: [50] Vendor Specific Information: VirtIO: ISR
> 	Capabilities: [40] Vendor Specific Information: VirtIO: CommonCfg
> 	Kernel driver in use: virtio-pci
> 
> 00:02.0 SCSI storage controller: Red Hat, Inc Virtio block device (rev 01)
> 	Subsystem: Red Hat, Inc Virtio block device
> 	Flags: bus master, fast devsel, latency 0, IRQ 41
> 	Memory at 10041000 (32-bit, non-prefetchable) [size=4K]
> 	Memory at 8000004000 (64-bit, prefetchable) [size=16K]
> 	Capabilities: [98] MSI-X: Enable+ Count=2 Masked-
> 	Capabilities: [84] Vendor Specific Information: VirtIO: <unknown>
> 	Capabilities: [70] Vendor Specific Information: VirtIO: Notify
> 	Capabilities: [60] Vendor Specific Information: VirtIO: DeviceCfg
> 	Capabilities: [50] Vendor Specific Information: VirtIO: ISR
> 	Capabilities: [40] Vendor Specific Information: VirtIO: CommonCfg
> 	Kernel driver in use: virtio-pci
> 
> 00:03.0 SCSI storage controller: Red Hat, Inc Virtio SCSI (rev 01)
> 	Subsystem: Red Hat, Inc Virtio SCSI
> 	Flags: bus master, fast devsel, latency 0, IRQ 42
> 	Memory at 10042000 (32-bit, non-prefetchable) [size=4K]
> 	Memory at 8000008000 (64-bit, prefetchable) [size=16K]
> 	Capabilities: [98] MSI-X: Enable+ Count=4 Masked-
> 	Capabilities: [84] Vendor Specific Information: VirtIO: <unknown>
> 	Capabilities: [70] Vendor Specific Information: VirtIO: Notify
> 	Capabilities: [60] Vendor Specific Information: VirtIO: DeviceCfg
> 	Capabilities: [50] Vendor Specific Information: VirtIO: ISR
> 	Capabilities: [40] Vendor Specific Information: VirtIO: CommonCfg
> 	Kernel driver in use: virtio-pci
> 
> 00:04.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
> 	Subsystem: Intel Corporation I350 Ethernet Controller Virtual Function
> 	Flags: bus master, fast devsel, latency 0
> 	Memory at 800000c000 (64-bit, prefetchable) [size=16K]
> 	Memory at 8000010000 (64-bit, prefetchable) [size=16K]
> 	Capabilities: [70] MSI-X: Enable+ Count=3 Masked-
> 	Capabilities: [a0] Express Root Complex Integrated Endpoint, MSI 00
> 	Capabilities: [100] Advanced Error Reporting
> 	Capabilities: [1a0] Transaction Processing Hints
> 	Capabilities: [1d0] Access Control Services
> 	Kernel driver in use: igbvf
> 
> root@...ssigned-hostname:~# uptime
>  05:40:45 up 27 days, 17:16,  1 user,  load average: 4.10, 4.05, 4.01
> 

I have make a new quirk to fix ACS problem on Hi1616, to enable VM using
pcie vf.

> For something that isn't supposed to work, it is pretty good. So
> please test it and let me know how it fares. At this stage, not
> regressing deployed HW is more important than improving the situation
> on HW that nobody has.
> 
>>
>>>
>>>>
>>>>>
>>>>>> Compared to default halt_poll_ns, 500000ns, enabling doorbells is not
>>>>>> large at all.
>>>>>
>>>>> Sure. But I'm not sure this is a universal figure.
>>>>>
>>>>>>
>>>>>>> Frankly, you want to be careful with that. I'd rather enable them late
>>>>>>> and have a chance of not blocking because of another (virtual)
>>>>>>> interrupt, which saves us the doorbell business.
>>>>>>>
>>>>>>> I wonder if you wouldn't be in a better position by drastically
>>>>>>> reducing halt_poll_ns for vcpu that can have directly injected
>>>>>>> interrupts.
>>>>>>>
>>>>>>
>>>>>> If we set halt_poll_ns to a small value, the chance of
>>>>>> not blocking and vcpu scheduled out becomes larger. The cost
>>>>>> of vcpu scheduled out is quite expensive.
>>>>>> In many cases, one pcpu is assigned to only
>>>>>> one vcpu, and halt_poll_ns is set quite large, to increase
>>>>>> chance of halt_poll process got terminated. This is the
>>>>>> reason we want to set halt_poll_ns large. And We hope vlpi
>>>>>> stop halt_poll process in time.
>>>>>
>>>>> Fair enough. It is certainly realistic to strictly partition the
>>>>> system when GICv4 is in use, so I can see some potential benefit.
>>>>>
>>>>>> Though it will check whether vcpu is runnable again by
>>>>>> kvm_vcpu_check_block. Vlpi can prevent scheduling vcpu out.
>>>>>> However it's somewhat later if halt_poll_ns is larger.
>>>>>>
>>>>>>> In any case, this is something that we should measure, not guess.
>>>>>
>>>>> Please post results of realistic benchmarks that we can reproduce,
>>>>> with and without this change. I'm willing to entertain the idea, but I
>>>>> need more than just a vague number.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> 	M.
>>>>>
>>>>
>>>> I tested it with and without change (patch attached in the last).
>>>> halt_poll_ns is keep default, 500000ns.
>>>> I have merged the patch "arm64: KVM: Always set ICH_HCR_EL2.EN if GICv4 is enabled"
>>>> to my test kernel, to make sure ICH_HCR_EL2.EN=1 in guest.
>>>>
>>>> netperf result:
>>>> D06 as server, intel 8180 server as client
>>>> with change:
>>>> package 512 bytes - 5400 Mbits/s
>>>> package 64 bytes - 740 Mbits/s
>>>> without change:
>>>> package 512 bytes - 5000 Mbits/s
>>>> package 64 bytes - 710 Mbits/s
>>>>
>>>> Also I have tested D06 as client, intel machine as server,
>>>> with the change, the result remains the same.
>>>
>>> So anywhere between 4% and 8% improvement. Please post results for D05
>>> once you've found one.
>>>
>>>>
>>>>
>>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>>> index 55fe8e2..0f56904 100644
>>>> --- a/virt/kvm/kvm_main.c
>>>> +++ b/virt/kvm/kvm_main.c
>>>> @@ -2256,6 +2256,16 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>>>>  	if (vcpu->halt_poll_ns) {
>>>>  		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
>>>>
>>>> +#ifdef CONFIG_ARM64
>>>> +		/*
>>>> +		 * When using gicv4, enable doorbell before halt pool wait
>>>> +		 * so that direct vlpi can stop halt poll.
>>>> +		 */
>>>> +		if (vcpu->arch.vgic_cpu.vgic_v3.its_vpe.its_vm) {
>>>> +			kvm_vgic_v4_enable_doorbell(vcpu);
>>>> +		}
>>>> +#endif
>>>> +
>>>
>>> Irk. No. You're now leaving doorbells enabled when going back to the
>>> guest, and that's pretty bad as the whole logic relies on doorbells
>>> being disabled if the guest can run.
>>>
>>> So please try this instead on top of mainline. And it has to be on top
>>> of mainline as we've changed the way timer interrupts work in 5.1 --
>>> see accb99bcd0ca ("KVM: arm/arm64: Simplify bg_timer programming").
>>>
> 
> [...]
> 
>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index f25aa98a94df..0ae4ad5dcb12 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -2252,6 +2252,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>>>  	bool waited = false;
>>>  	u64 block_ns;
>>>  
>>> +	kvm_arch_vcpu_blocking(vcpu);
>>> +
>>>  	start = cur = ktime_get();
>>>  	if (vcpu->halt_poll_ns) {
>>>  		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
>>> @@ -2272,8 +2274,6 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>>>  		} while (single_task_running() && ktime_before(cur, stop));
>>>  	}
>>>  
>>> -	kvm_arch_vcpu_blocking(vcpu);
>>> -
>>>  	for (;;) {
>>>  		prepare_to_swait_exclusive(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
>>>  
>>> @@ -2287,8 +2287,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>>>  	finish_swait(&vcpu->wq, &wait);
>>>  	cur = ktime_get();
>>>  
>>> -	kvm_arch_vcpu_unblocking(vcpu);
>>>  out:
>>> +	kvm_arch_vcpu_unblocking(vcpu);
>>>  	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
>>>  
>>>  	if (!vcpu_valid_wakeup(vcpu))
>>>
>>> Thanks,
>>>
>>> 	M.
>>>
>>
>> I've tested your patch and the results showed as follow:
>>
>> netperf result:
>> D06 as server, intel 8180 server as client
>> with change:
>> package 512 bytes - 5500 Mbits/s
>> package 64 bytes - 760 Mbits/s
>> without change:
>> package 512 bytes - 5000 Mbits/s
>> package 64 bytes - 710 Mbits/s
> 
> OK, that's pretty good. Let me know once you've tested it on D05.
> 
> Thanks,
>
> 	M.
>

The average cost of kvm_vgic_v4_enable_doorbell on D05 is 0.74us,
while it is 0.35us on D06.

netperf result:
server: D05 vm using 82599 vf,
client: intel 8180
5.0.0-rc3, have merged the patch "arm64: KVM: Always set ICH_HCR_EL2.EN
if GICv4 is enabled"

with change:
package 512 bytes - 7150 Mbits/s
package 64 bytes - 1080 Mbits/s
without change:
package 512 bytes - 7050 Mbits/s
package 64 bytes - 1080 Mbits/s

It seems not work on D05, as the doorbell enable process costs more than
that on D06.

Thanks,
Tang

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ