Message-ID: <a7fc58e4-64c2-77fc-c1dc-f5eb78dbbb01@huawei.com>
Date: Wed, 21 Aug 2024 17:51:27 +0800
From: Kunkun Jiang <jiangkunkun@...wei.com>
To: Thomas Gleixner <tglx@...utronix.de>, Marc Zyngier <maz@...nel.org>,
	Oliver Upton <oliver.upton@...ux.dev>, James Morse <james.morse@....com>,
	Suzuki K Poulose <suzuki.poulose@....com>, Zenghui Yu <yuzenghui@...wei.com>
CC: "open list:IRQ SUBSYSTEM" <linux-kernel@...r.kernel.org>, "moderated
 list:ARM SMMU DRIVERS" <linux-arm-kernel@...ts.infradead.org>,
	<kvmarm@...ts.linux.dev>, "wanghaibin.wang@...wei.com"
	<wanghaibin.wang@...wei.com>, <nizhiqiang1@...wei.com>,
	"tangnianyao@...wei.com" <tangnianyao@...wei.com>, <wangzhou1@...ilicon.com>
Subject: [bug report] GICv4.1: multiple vpus execute vgic_v4_load at the same
 time will greatly increase the time consumption

Hi all,

Recently I discovered a problem with GICv4.1. The scenario is as follows:
1. Enable GICv4.1.
2. Create multiple VMs, for example 50 VMs (4U8G).
3. The workload running in the VMs performs frequent MMIO accesses and needs
   to exit to qemu for processing.
4. Alternatively, modify the kvm code so that WFI always traps to kvm.
5. The utilization of the pcpu where the vcpu is located then reaches 100%,
   and basically all of it is in sys.
6. This problem does not exist with GICv3.

According to my analysis, the problem comes from the execution of
vgic_v4_load. The call path is:
vcpu_load or kvm_sched_in
     kvm_arch_vcpu_load
     ...
         vgic_v4_load
             irq_set_affinity
             ...
                 irq_do_set_affinity
                     raw_spin_lock(&tmp_mask_lock)
                     chip->irq_set_affinity
                     ...
                       its_vpe_set_affinity

The tmp_mask_lock is the key: it is a global lock. I don't quite understand
why tmp_mask_lock is needed here. I think there are two possible solutions:
1. Remove this tmp_mask_lock.
2. Modify the gicv4 driver so that it does not perform VMOVP via
   irq_set_affinity.
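
To make the contention argument concrete, here is a minimal user-space
sketch, not kernel code. It assumes nothing about why tmp_mask_lock exists.
It only shows that when every caller goes through one shared lock, at most
one of them is ever inside the protected section, whereas per-caller locks
do not impose that limit. All names (global_lock, own_lock,
fake_set_affinity, run_workers) are hypothetical stand-ins, and pthread
mutexes stand in for the raw spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORKERS 64

/* Hypothetical stand-in for the single global lock named in the report. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

struct worker {
    pthread_mutex_t own_lock;   /* hypothetical per-vCPU alternative */
    bool use_global;
    int iterations;
};

static atomic_int in_section;      /* workers currently inside the section */
static atomic_int max_in_section;  /* highest value of in_section observed */
static atomic_long total_calls;    /* number of completed calls */

/* Stand-in for the work done under the lock; it only records concurrency. */
static void fake_set_affinity(void)
{
    int now = atomic_fetch_add(&in_section, 1) + 1;
    int seen = atomic_load(&max_in_section);

    /* Raise the recorded maximum if this call observed a higher value. */
    while (now > seen &&
           !atomic_compare_exchange_weak(&max_in_section, &seen, now))
        ;
    atomic_fetch_add(&total_calls, 1);
    atomic_fetch_sub(&in_section, 1);
}

/* Stand-in for a vCPU that is loaded over and over again. */
static void *worker_main(void *arg)
{
    struct worker *w = arg;
    pthread_mutex_t *lock = w->use_global ? &global_lock : &w->own_lock;

    for (int i = 0; i < w->iterations; i++) {
        int rc = pthread_mutex_lock(lock);
        assert(rc == 0);
        fake_set_affinity();
        pthread_mutex_unlock(lock);
    }
    return NULL;
}

/*
 * Run n workers and return the highest number of workers seen inside the
 * section at the same time, or -1 on error.
 */
static int run_workers(int n, int iterations, bool use_global)
{
    pthread_t tids[MAX_WORKERS];
    struct worker workers[MAX_WORKERS];
    int started = 0;

    if (n < 1 || n > MAX_WORKERS)
        return -1;

    atomic_store(&in_section, 0);
    atomic_store(&max_in_section, 0);
    atomic_store(&total_calls, 0);

    for (int i = 0; i < n; i++) {
        workers[i].use_global = use_global;
        workers[i].iterations = iterations;
        pthread_mutex_init(&workers[i].own_lock, NULL);
    }
    for (int i = 0; i < n; i++) {
        if (pthread_create(&tids[i], NULL, worker_main, &workers[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    for (int i = 0; i < n; i++)
        pthread_mutex_destroy(&workers[i].own_lock);

    return started == n ? atomic_load(&max_in_section) : -1;
}
```

Calling run_workers(8, 1000, true) from a small main() always returns 1,
because the shared lock serializes all eight workers. With
run_workers(8, 1000, false) the result can be anything from 1 to 8,
depending on scheduling. The sketch says nothing about whether the shared
state behind tmp_mask_lock can actually be split per caller.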

Everyone is welcome to discuss.

Thanks,
Kunkun Jiang

